Sometimes you need to export a large amount of JSON data to a file, or dump an entire data set as JSON. As with any large data set, you can't just load it all into memory and write it out in one go: the export takes a while, it reads a large number of entries from the database, and you have to make sure such exports don't overload the whole system or run out of memory.
Fortunately, this is fairly simple to do with the help of Jackson's SequenceWriter and, optionally, piped streams. Here's what it looks like:
$title(pom.xml)
...
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-lang3</artifactId>
    <version>3.6</version>
</dependency>
...
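The pom above only shows the commons-lang3 dependency used for the StopWatch. Jackson itself also has to be on the classpath; a minimal addition, assuming the jackson-databind 2.9.x line (use whatever Jackson version your project already depends on), would be:
$title(pom.xml)
...
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.9.6</version>
</dependency>
...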
$title(Demo.java)
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.ObjectWriter;
import com.fasterxml.jackson.databind.SequenceWriter;
import org.apache.commons.lang3.time.StopWatch;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.scheduling.annotation.Async;
import org.springframework.scheduling.annotation.AsyncResult;
import org.springframework.util.concurrent.ListenableFuture;

import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.zip.GZIPOutputStream;

public class Demo {

    private static final Logger logger = LoggerFactory.getLogger(Demo.class);

    private ObjectMapper jsonMapper = new ObjectMapper();
    private ExecutorService executorService = Executors.newFixedThreadPool(5);

    @Async
    public ListenableFuture<Boolean> export(UUID customerId) {
        try (PipedInputStream in = new PipedInputStream();
             PipedOutputStream pipedOut = new PipedOutputStream(in);
             GZIPOutputStream out = new GZIPOutputStream(pipedOut)) {

            StopWatch stopWatch = StopWatch.createStarted(); // commons-lang3 StopWatch

            ObjectWriter writer = jsonMapper.writer().withDefaultPrettyPrinter();
            try (SequenceWriter sequenceWriter = writer.writeValues(out)) {
                // wrap all written records in a single JSON array
                sequenceWriter.init(true);

                // store the file on a separate thread so that writing does not block
                // the thread that reads from the database; storageProvider and
                // getFilePath(..) stand for your storage integration and are not shown here
                Future<?> storageFuture = executorService.submit(() ->
                        storageProvider.storeFile(getFilePath(customerId), in));

                int batchCounter = 0;
                while (true) {
                    // read the data in batches; the concrete implementation is not provided here
                    List<Record> batch = readDatabaseBatch(batchCounter++);
                    for (Record record : batch) {
                        sequenceWriter.write(record); // Record is the data object being exported
                    }
                    if (batch.isEmpty()) {
                        // no more batches, stop reading
                        break;
                    }
                }

                // wait for the upload/storage to complete
                storageFuture.get();
            }

            stopWatch.stop();
            logger.info("Exporting took {} seconds", stopWatch.getTime(TimeUnit.SECONDS));
            return AsyncResult.forValue(true);
        } catch (Exception ex) {
            logger.error("Failed to export data", ex);
            return AsyncResult.forValue(false);
        }
    }
}
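The readDatabaseBatch call above is deliberately left unimplemented, because it depends entirely on your database and data-access layer. Purely as a hypothetical sketch, assuming Spring's JdbcTemplate, a MySQL/PostgreSQL-style LIMIT/OFFSET query, and a made-up customer_record table and Record shape (none of which come from the original example), batched reading could look roughly like this:
$title(BatchReader.java)
import org.springframework.jdbc.core.JdbcTemplate;

import java.util.List;

// Hypothetical sketch: the table name, columns, BATCH_SIZE and the Record shape
// are assumptions for illustration only.
public class BatchReader {

    private static final int BATCH_SIZE = 1000;

    private final JdbcTemplate jdbcTemplate;

    public BatchReader(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    // Simple value object standing in for the exported record
    public static class Record {
        public final long id;
        public final String payload;

        public Record(long id, String payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    public List<Record> readDatabaseBatch(int batchCounter) {
        // LIMIT/OFFSET keeps the sketch simple; for very large tables, keyset pagination
        // (WHERE id > lastSeenId) avoids the growing cost of big offsets
        return jdbcTemplate.query(
                "SELECT id, payload FROM customer_record ORDER BY id LIMIT ? OFFSET ?",
                (rs, rowNum) -> new Record(rs.getLong("id"), rs.getString("payload")),
                BATCH_SIZE, (long) batchCounter * BATCH_SIZE);
    }
}
Whatever the actual implementation, the property the export loop relies on is the same: each call returns at most one batch, and an empty list signals that there is nothing left to read.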
The export method does the following:
- Jackson's SequenceWriter streams each record to the GZIP-compressed output as soon as it is read, instead of building the whole document in memory.
- PipedOutputStream/PipedInputStream turn that output into an InputStream, which the storage provider consumes on a separate thread submitted to the executor.
- The database is read batch by batch until an empty batch signals the end of the data.
- The whole method is annotated with @Async, so the export runs in the background and reports its outcome through a ListenableFuture.
The main work is done by Jackson's SequenceWriter, and the (fairly obvious) point to take home is: don't assume your data will fit in memory. It almost never does, so do everything in batches, with incremental writes.
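If you only need a local file and nothing downstream requires an InputStream, the piped streams and the extra storage thread can be dropped entirely; the piping in the full example exists only because storeFile consumes an InputStream. Below is a minimal, self-contained sketch of the bare SequenceWriter pattern, using a hypothetical fetchBatch stand-in and a made-up Entry class:
$title(FileExportSketch.java)
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SequenceWriter;

import java.io.FileOutputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.zip.GZIPOutputStream;

public class FileExportSketch {

    // Made-up value object used for illustration only
    public static class Entry {
        public long id;
        public String name;

        public Entry(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();

        try (OutputStream out = new GZIPOutputStream(new FileOutputStream("export.json.gz"));
             SequenceWriter sequenceWriter = mapper.writer().writeValues(out)) {

            sequenceWriter.init(true); // wrap the records in a JSON array

            int batchCounter = 0;
            while (true) {
                List<Entry> batch = fetchBatch(batchCounter++);
                if (batch.isEmpty()) {
                    break; // nothing left to export
                }
                for (Entry entry : batch) {
                    // each record is written to the gzipped file as soon as it is read,
                    // so the full data set is never held in memory
                    sequenceWriter.write(entry);
                }
            }
        }
    }

    // Hypothetical stand-in for the batched database read: returns two small batches, then stops
    private static List<Entry> fetchBatch(int batchCounter) {
        if (batchCounter >= 2) {
            return Collections.emptyList();
        }
        return Arrays.asList(
                new Entry(batchCounter * 2L, "first"),
                new Entry(batchCounter * 2L + 1, "second"));
    }
}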
http://blog.xqlee.com/article/492.html