Java: Out of memory error when my application runs for a long time

Tags: java, multithreading, amazon-s3

I have a Java application in which I receive very small files (1 KB), but a huge number of them: about 20,000 files per minute. I am uploading the files to S3. I run this in 10 parallel threads, and I have to run this application continuously. When this application has been running for a few days, I get an out of memory error. This is the exact error I am getting:
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 347376 bytes for Chunk::new
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
# Out of Memory Error (allocation.cpp:390), pid=6912, tid=0x000000000003ec8c
#
# JRE version: Java(TM) SE Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.181-b13 mixed mode windows-amd64 compressed oops)
# Core dump written. Default location: d:\S3FileUploaderApp\hs_err_pid6912.mdmp
#
Here are my Java classes; I am copying all of them to help with the investigation.
Here is my Java VisualVM report image.
Adding my sample output.
Updated Metaspace image.
This is my main class:
public class UploadExecutor {
    private static Logger _logger = Logger.getLogger(UploadExecutor.class);

    public static void main(String[] args) {
        _logger.info("----------STARTING JAVA MAIN METHOD----------------- ");
        /*
         * 3 C:\\Users\\u6034690\\Desktop\\TWOFILE\\xml
         * a205381-tr-fr-production-us-east-1-trf-auditabilty
         */
        final int batchSize = 100;
        while (true) {
            String strNoOfThreads = args[0];
            String strFileLocation = args[1];
            String strBucketName = args[2];
            int iNoOfThreads = Integer.parseInt(strNoOfThreads);
            S3ClientManager s3ClientObj = new S3ClientManager();
            AmazonS3Client s3Client = s3ClientObj.buildS3Client();
            try {
                FileProcessThreads fp = new FileProcessThreads();
                File[] files = fp.getFiles(strFileLocation);
                try {
                    _logger.info("No records found will wait for 10 Seconds");
                    TimeUnit.SECONDS.sleep(10);
                    files = fp.getFiles(strFileLocation);
                    ArrayList<File> batchFiles = new ArrayList<File>(batchSize);
                    if (null != files) {
                        for (File path : files) {
                            String fileType = FilenameUtils.getExtension(path.getName());
                            long fileSize = path.length();
                            if (fileType.equals("gz") && fileSize > 0) {
                                batchFiles.add(path);
                            }
                            if (batchFiles.size() == batchSize) {
                                BuildThread BuildThreadObj = new BuildThread();
                                BuildThreadObj.buildThreadLogic(iNoOfThreads, s3Client, batchFiles, strFileLocation,
                                        strBucketName);
                                _logger.info("---Batch One got completed---");
                                batchFiles.clear();
                            }
                        }
                    }
                    // to consider remaining files with count < batch size
                    if (!batchFiles.isEmpty()) {
                        BuildThread BuildThreadObj = new BuildThread();
                        BuildThreadObj.buildThreadLogic(iNoOfThreads, s3Client, batchFiles, strFileLocation,
                                strBucketName);
                        batchFiles.clear();
                    }
                } catch (InterruptedException e) {
                    _logger.error("InterruptedException: " + e.toString());
                }
            } catch (Throwable t) {
                _logger.error("Throwable: " + t.toString());
            }
        }
    }
}
This is where I fetch the files:
public class FileProcessThreads {
    public File[] getFiles(String fileLocation) {
        File dir = new File(fileLocation);
        File[] directoryListing = dir.listFiles();
        // listFiles() returns null when the path is not a readable directory
        if (directoryListing != null && directoryListing.length > 0)
            return directoryListing;
        return null;
    }
}
What version of Java are you using, and what parameters have you set for the garbage collector? Recently I ran into an issue where our Java 8 applications, running with default settings, would consume all of the server's available memory over time. I fixed it by adding the following parameters to each application:

- `-XX:+UseG1GC` — make the application use the G1 garbage collector.
- `-Xms32M` — set the minimum heap size to 32 MB.
- `-Xmx512M` — set the maximum heap size to 512 MB.
- `-XX:MinHeapFreeRatio=20` — set the minimum free-heap ratio used when growing the heap.
- `-XX:MaxHeapFreeRatio=40` — set the maximum free-heap ratio used when shrinking the heap.
The article I found that helped me solve the problem describes everything above in more detail.

Sorry that I have not answered your original question about the memory leak, but your approach looks fundamentally flawed to me.
The System.exit() call on UploadObject may well be what leaks resources, but that is only the beginning. The Amazon S3 TransferManager already has an internal executor service, so you do not need your own multithreading controller on top of it. I also cannot see how you guarantee that each file is uploaded only once: you issue several upload calls and then delete all the files without checking whether an upload failed and a file therefore never reached S3. Trying to distribute files across executors yourself is unnecessary, and adding more threads on top of the TransferManager's ExecutorService will not improve your throughput; it will only cause thrashing.
I would take a different approach.

First, a very simple main class that only starts a worker thread and waits for it to finish:
public class S3Uploader {
    public static void main(String[] args) throws Exception {
        final String strNoOfThreads = args[0];
        final String strFileLocation = args[1];
        final String strBucketName = args[2];
        // Maximum number of file names that are read into memory
        final int maxFileQueueSize = 5000;
        S3UploadWorkerThread worker = new S3UploadWorkerThread(strFileLocation, strBucketName, Integer.parseInt(strNoOfThreads), maxFileQueueSize);
        // start() runs the worker on its own thread; calling run() here would block the main thread
        worker.start();
        System.out.println("Uploading files, press any key to stop.");
        System.in.read();
        // Gracefully halt the worker thread, waiting for any ongoing uploads to finish
        worker.finish();
        // Exit the main thread only after the worker thread has terminated
        worker.join();
    }
}
The worker thread uses a Semaphore to limit the number of uploads sent to the TransferManager, a custom file-name queue FileEnqueue to continuously read files from the source directory, and a ProgressListener to track the progress of each upload. If the loop finds no more files to read in the source directory, it waits ten seconds and retries. Even the file queue may be unnecessary; simply listing the files inside the worker's while loop would be enough.
import java.io.File;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import com.amazonaws.AmazonClientException;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
import com.amazonaws.services.s3.transfer.Upload;
public class S3UploadWorkerThread extends Thread {
    private final String sourceDir;
    private final String targetBucket;
    private final int maxQueueSize;
    private final AmazonS3Client s3Client;
    private Semaphore uploadLimiter;
    // volatile so the change made by finish() on the main thread is visible here
    private volatile boolean running;
    public final long SLEEP_WHEN_NO_FILES_AVAILABLE_MS = 10000l; // 10 seconds

    public S3UploadWorkerThread(final String sourceDir, final String targetBucket, final int maxConcurrentUploads, final int maxQueueSize) {
        this.running = false;
        this.sourceDir = sourceDir.endsWith(File.separator) ? sourceDir : sourceDir + File.separator;
        this.targetBucket = targetBucket;
        this.maxQueueSize = maxQueueSize;
        this.s3Client = S3ClientManager.buildS3Client();
        this.uploadLimiter = new Semaphore(maxConcurrentUploads);
    }

    public void finish() {
        running = false;
    }

    @Override
    public void run() {
        running = true;
        final Map<String, Upload> ongoingUploads = new ConcurrentHashMap<>();
        final FileEnqueue queue = new FileEnqueue(sourceDir, maxQueueSize);
        final TransferManager tm = TransferManagerBuilder.standard().withS3Client(s3Client).build();
        while (running) {
            // Get a file name from the in-memory queue
            final String fileName = queue.poll();
            if (fileName != null) {
                try {
                    // Limit the number of concurrent uploads
                    uploadLimiter.acquire();
                    File fileObj = new File(sourceDir + fileName);
                    // Create an upload listener
                    UploadListener onComplete = new UploadListener(fileObj, queue, ongoingUploads, uploadLimiter);
                    try {
                        Upload up = tm.upload(targetBucket, fileName, fileObj);
                        up.addProgressListener(onComplete);
                        // ongoingUploads is used later to wait for ongoing uploads in case a finish() is requested
                        ongoingUploads.put(fileName, up);
                    } catch (AmazonClientException e) {
                        System.err.println("AmazonClientException " + e.getMessage());
                    }
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            } else {
                // poll() returns null when the source directory is empty, so wait a number of seconds
                try {
                    Thread.sleep(SLEEP_WHEN_NO_FILES_AVAILABLE_MS);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            } // fi
        } // wend
        // Wait for ongoing uploads to finish before ending the worker thread
        for (Map.Entry<String, Upload> e : ongoingUploads.entrySet()) {
            try {
                e.getValue().waitForCompletion();
            } catch (AmazonClientException | InterruptedException x) {
                System.err.println(x.getClass().getName() + " at " + e.getKey());
            }
        } // next
        tm.shutdownNow();
    }
}
Here is a boilerplate example of the file queue:
import java.io.File;
import java.io.FileFilter;
import java.util.concurrent.ConcurrentSkipListSet;
public class FileEnqueue {
    private final String sourceDir;
    private final ConcurrentSkipListSet<FileItem> seen;
    private final ConcurrentSkipListSet<String> processing;
    private final int maxSeenSize;

    public FileEnqueue(final String sourceDirectory, int maxQueueSize) {
        sourceDir = sourceDirectory;
        maxSeenSize = maxQueueSize;
        seen = new ConcurrentSkipListSet<FileItem>();
        processing = new ConcurrentSkipListSet<>();
    }

    public synchronized String poll() {
        if (seen.size() == 0)
            enqueueFiles();
        FileItem fi = seen.pollFirst();
        if (fi == null) {
            return null;
        } else {
            processing.add(fi.getName());
            return fi.getName();
        }
    }

    public void done(final String fileName) {
        processing.remove(fileName);
    }

    private void enqueueFiles() {
        final FileFilter gzFilter = new GZFileFilter();
        final File dir = new File(sourceDir);
        if (!dir.exists()) {
            System.err.println("Directory " + sourceDir + " not found");
        } else if (!dir.isDirectory()) {
            System.err.println(sourceDir + " is not a directory");
        } else {
            final File[] files = dir.listFiles(gzFilter);
            if (files != null) {
                // How many more file names can we read into memory
                final int spaceLeft = maxSeenSize - seen.size();
                // How many new files will be read into memory
                final int maxNewFiles = files.length < maxSeenSize ? files.length : spaceLeft;
                for (int f = 0, enqueued = 0; f < files.length && enqueued < maxNewFiles; f++) {
                    File fl = files[f];
                    FileItem fi = new FileItem(fl);
                    // Do not put into the queue any file which has already been seen or is processing
                    if (!seen.contains(fi) && !processing.contains(fi.getName())) {
                        seen.add(fi);
                        enqueued++;
                    }
                } // next
            }
        } // fi
    }

    private class GZFileFilter implements FileFilter {
        @Override
        public boolean accept(File f) {
            final String fname = f.getName().toLowerCase();
            return f.isFile() && fname.endsWith(".gz") && f.length() > 0L;
        }
    }
}
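The seen/processing bookkeeping above is what prevents a file from being enqueued twice while it is still uploading. A tiny self-contained sketch of that check (the class name and file name are illustrative):

```java
import java.util.concurrent.ConcurrentSkipListSet;

// Sketch of FileEnqueue's dedup rule: a rescanned file is re-enqueued only
// if it is neither waiting in 'seen' nor currently in 'processing'.
public class DedupSketch {
    public static boolean wouldRequeue() {
        ConcurrentSkipListSet<String> seen = new ConcurrentSkipListSet<>();
        ConcurrentSkipListSet<String> processing = new ConcurrentSkipListSet<>();
        seen.add("a.gz");                 // first scan finds a.gz
        String polled = seen.pollFirst(); // hand it to an uploader...
        processing.add(polled);           // ...and mark it as in progress
        // A later rescan sees a.gz on disk again while the upload is ongoing:
        return !seen.contains("a.gz") && !processing.contains("a.gz");
    }

    public static void main(String[] args) {
        System.out.println(wouldRequeue()); // false: a.gz is still uploading
    }
}
```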
Update 30 April 2019: adding the FileItem class.
import java.io.File;
import java.util.Comparator;
public class FileItem implements Comparable {
    private final String name;
    private final long dateSeen;

    public FileItem(final File file) {
        this.name = file.getName();
        this.dateSeen = System.currentTimeMillis();
    }

    public String getName() {
        return name;
    }

    public long getDateSeen() {
        return dateSeen;
    }

    @Override
    public int compareTo(Object otherObj) {
        FileItem otherFileItem = (FileItem) otherObj;
        if (getDateSeen() == otherFileItem.getDateSeen())
            return getName().compareTo(otherFileItem.getName());
        else if (getDateSeen() < otherFileItem.getDateSeen())
            return -1;
        else
            return 1;
    }

    @Override
    public boolean equals(Object otherFile) {
        return getName().equals(((FileItem) otherFile).getName());
    }

    @Override
    public int hashCode() {
        return getName().hashCode();
    }

    public static final class CompareFileItems implements Comparator {
        @Override
        public int compare(Object fileItem1, Object fileItem2) {
            return ((FileItem) fileItem1).compareTo(fileItem2);
        }
    }
}
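One subtlety worth noting (my own observation, not from the answer): ConcurrentSkipListSet decides membership by compareTo, not equals, so a FileItem-style class that compares by (timestamp, name) but defines equality by name alone will answer contains() with false for the same file seen at a different time. A self-contained sketch:

```java
import java.util.concurrent.ConcurrentSkipListSet;

// ConcurrentSkipListSet uses compareTo, not equals, for contains().
// Item mimics FileItem: ordering by (stamp, name), equality by name only.
public class SkipListMembership {
    static final class Item implements Comparable<Item> {
        final String name;
        final long stamp;

        Item(String name, long stamp) {
            this.name = name;
            this.stamp = stamp;
        }

        @Override
        public int compareTo(Item o) {
            int c = Long.compare(stamp, o.stamp);
            return c != 0 ? c : name.compareTo(o.name);
        }

        @Override
        public boolean equals(Object o) {
            return name.equals(((Item) o).name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    public static boolean containsLater() {
        ConcurrentSkipListSet<Item> seen = new ConcurrentSkipListSet<>();
        seen.add(new Item("a.gz", 1L));
        // Same name, later timestamp: equals() says equal, compareTo() does not
        return seen.contains(new Item("a.gz", 2L));
    }

    public static void main(String[] args) {
        System.out.println(containsLater()); // false despite equals() matching
    }
}
```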
I cannot run your code in my environment, but I suggest creating a memory dump with jmap. The syntax of this command is: jmap -dump:format=b,file=heap.bin. Next, open the memory dump in the Eclipse Memory Analyzer and look at the suspicious objects. You can even create two dumps separated by some interval and compare them; that way you will see which objects you have more of in the second dump. Could you upload your GC logs and share a link to the report? I suspect your GC may be running too frequently. I don't have GC logs right now, but I will try to collect them. There is also a while loop in my main method; could that cause any problem? Sudarshan, could you include a printout of Metaspace (the tab to the right of the Heap tab)? Memory may be being consumed by Metaspace, because the heap looks fine. As far as I can tell, you are not running this in 10 threads…
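The jmap workflow the commenter describes might look like this (pid 6912 is taken from the crash log above for illustration; the dump file names are arbitrary):

```
jmap -dump:format=b,file=heap1.bin 6912
# ... wait some interval, then take a second snapshot ...
jmap -dump:format=b,file=heap2.bin 6912
```

Opening both dumps in the Eclipse Memory Analyzer and comparing them shows which objects accumulated in the interval.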
The UploadListener releases the semaphore permit and cleans up after each upload completes or fails:

import java.io.File;
import java.util.Map;
import java.util.concurrent.Semaphore;
import com.amazonaws.event.ProgressEvent;
import com.amazonaws.event.ProgressListener;
import com.amazonaws.services.s3.transfer.Upload;

public class UploadListener implements ProgressListener {
    private final File fileObj;
    private final FileEnqueue queue;
    private final Map<String, Upload> ongoingUploads;
    private final Semaphore uploadLimiter;

    public UploadListener(File fileObj, FileEnqueue queue, Map<String, Upload> ongoingUploads, Semaphore uploadLimiter) {
        this.fileObj = fileObj;
        this.queue = queue;
        this.ongoingUploads = ongoingUploads;
        this.uploadLimiter = uploadLimiter;
    }

    @Override
    public void progressChanged(ProgressEvent event) {
        switch (event.getEventType()) {
            case TRANSFER_STARTED_EVENT:
                System.out.println("Started upload of file " + fileObj.getName());
                break;
            case TRANSFER_COMPLETED_EVENT:
                /* Upon a successful upload:
                 * 1. Delete the file from disk
                 * 2. Notify the file name queue that the file is done
                 * 3. Remove it from the map of ongoing uploads
                 * 4. Release the semaphore permit
                 */
                fileObj.delete();
                queue.done(fileObj.getName());
                ongoingUploads.remove(fileObj.getName());
                uploadLimiter.release();
                System.out.println("Successfully finished upload of file " + fileObj.getName());
                break;
            case TRANSFER_FAILED_EVENT:
                queue.done(fileObj.getName());
                ongoingUploads.remove(fileObj.getName());
                uploadLimiter.release();
                System.err.println("Failed upload of file " + fileObj.getName());
                break;
            default:
                // do nothing
        }
    }
}
Finally, the S3ClientManager used to build the client:

import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

public class S3ClientManager {
    public static AmazonS3Client buildS3Client() {
        AWSCredentials credential = new ProfileCredentialsProvider("TRFAuditability-Prod-ServiceUser").getCredentials();
        AmazonS3Client s3Client = (AmazonS3Client) AmazonS3ClientBuilder.standard().withRegion("us-east-1")
                .withCredentials(new AWSStaticCredentialsProvider(credential)).withForceGlobalBucketAccessEnabled(true)
                .build();
        s3Client.getClientConfiguration().setMaxConnections(5000);
        s3Client.getClientConfiguration().setConnectionTimeout(6000);
        s3Client.getClientConfiguration().setSocketTimeout(30000);
        return s3Client;
    }
}