Using a concurrent hashmap to reduce memory usage with threadpool?

开发者 https://www.devze.com 2023-04-01 11:03 Source: Internet

I'm working with a program that runs lengthy SQL queries and stores the processed results in a HashMap. Currently, to get around the slow execution time of each of the 20-200 queries, I am using a fixed thread pool and a custom callable to do the searching. As a result, each callable is creating a local copy of the data which it then returns to the main program to be included in the report.

I've noticed that 100 query reports, which used to run without issue, now cause me to run out of memory. My speculation is that because these callables are creating their own copy of the data, I'm doubling memory usage when I join them into another large HashMap. I realize I could try to coax the garbage collector to run by attempting to reduce the scope of the callable's table, but that level of restructuring is not really what I want to do if it's possible to avoid.

Could I improve memory usage by replacing the callables with runnables that, instead of storing the data, write it to a concurrent HashMap? Or does it sound like I have some other problem here?


Don't create a copy of the data; just pass references around, ensuring thread safety where needed. If you still get an OOM without copying the data, consider increasing the maximum heap available to the application (for example with the JVM's -Xmx option).

The drawback of sharing references instead of copying the data is that thread safety is harder to achieve, though.
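
To illustrate what "passing references" means here, the following is a minimal sketch rather than the asker's actual code: the class name MergeByReferenceDemo, the runQuery stand-in, and the int[] row type are all hypothetical. It relies on the fact that Map.putAll copies only references to the keys and values, so merging a callable's result into the report map does not duplicate the row data itself; what lingers is the partial map's own bookkeeping, which can be released by dropping the Future once it has been merged.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class MergeByReferenceDemo {
    // Hypothetical stand-in for one slow SQL query plus its processing.
    static Map<String, int[]> runQuery(int queryId) {
        Map<String, int[]> partial = new HashMap<>();
        partial.put("query-" + queryId, new int[] {queryId, queryId * 2});
        return partial;
    }

    static Map<String, int[]> runAll(int queryCount, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Map<String, int[]>>> futures = new ArrayList<>();
            for (int i = 0; i < queryCount; i++) {
                final int id = i;
                Callable<Map<String, int[]>> task = () -> runQuery(id);
                futures.add(pool.submit(task));
            }
            Map<String, int[]> report = new HashMap<>();
            for (int i = 0; i < futures.size(); i++) {
                // putAll copies references to the int[] rows, not the rows themselves.
                report.putAll(futures.get(i).get());
                // Drop the Future so the partial map becomes eligible for collection.
                futures.set(i, null);
            }
            return report;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for queries", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a query task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runAll(5, 2).size());
    }
}
```

If memory use really doubles with a structure like this, the copying is probably happening somewhere other than the merge step, for instance in code that rebuilds each row while joining.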


Do you really need all 100-200 reports at the same time?

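Maybe it's worth limiting the first level of caching to just 50 reports and introducing a second level based on WeakHashMap? When the first level exceeds its size, the least recently used report is pushed to the second level, whose size depends on the amount of available memory (through the use of WeakHashMap). Keep in mind that WeakHashMap holds its keys weakly and drops an entry once the key is no longer strongly referenced elsewhere; values held through SoftReference are what the JVM clears specifically under memory pressure.

Then, to look up a report, you first query the first level. If the value is not there, query the second level. If it is not there either, the report was reclaimed by the GC when there was not enough memory, and you have to query the DB again for that report.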
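
The lookup order described above can be sketched as follows. This is only an illustration under stated assumptions: the class TwoLevelCache, its methods, and the loader callback (standing in for the DB query) are hypothetical names, and the second level deliberately uses SoftReference values in a plain HashMap instead of the WeakHashMap mentioned in the answer, because soft references are the ones the JVM clears in response to memory demand. The first level is a LinkedHashMap in access order, which gives LRU eviction through removeEldestEntry.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

class TwoLevelCache<K, V> {
    // Second level: values are only softly reachable, so the GC may clear them.
    private final Map<K, SoftReference<V>> secondLevel = new HashMap<>();
    // First level: bounded LRU map holding strong references.
    private final LinkedHashMap<K, V> firstLevel;

    TwoLevelCache(int firstLevelCapacity) {
        // accessOrder = true keeps the least recently used entry first.
        this.firstLevel = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                if (size() > firstLevelCapacity) {
                    // Demote the LRU entry to the second level before eviction.
                    secondLevel.put(eldest.getKey(), new SoftReference<>(eldest.getValue()));
                    return true;
                }
                return false;
            }
        };
    }

    // loader is a hypothetical callback that re-runs the query for a key.
    synchronized V get(K key, Function<K, V> loader) {
        V value = firstLevel.get(key);
        if (value == null) {
            SoftReference<V> ref = secondLevel.remove(key);
            value = (ref == null) ? null : ref.get();
            if (value == null) {
                // Never cached, or cleared by the GC: load it again.
                value = loader.apply(key);
            }
            firstLevel.put(key, value); // promote back to the first level
        }
        return value;
    }

    synchronized boolean inFirstLevel(K key) {
        return firstLevel.containsKey(key);
    }
}
```

One limitation of this sketch is that a second-level entry whose value has been cleared stays in the map until its key is requested again; a ReferenceQueue could be used to purge such entries if that matters.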


Do the results of the queries depend on other query results? If not, whenever a result becomes available in a worker thread, just put it into a ConcurrentHashMap, as you are suggesting. Do you really need to ask whether creating several unnecessary copies of the data is causing your program to run out of memory? This should almost be obvious.
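
As a rough sketch of that suggestion, assuming the queries are independent of each other: each Runnable writes its processed result straight into one shared ConcurrentHashMap, so no per-task map is ever built or merged. The class name SharedMapDemo, the processQuery stand-in, and the one-hour timeout are hypothetical choices for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SharedMapDemo {
    // Hypothetical stand-in for running one SQL query and processing its rows.
    static String processQuery(int id) {
        return "result-" + id;
    }

    static Map<Integer, String> runAll(int queryCount, int threads) {
        Map<Integer, String> report = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < queryCount; i++) {
            final int id = i;
            // Each task stores its result directly in the shared map.
            pool.execute(() -> report.put(id, processQuery(id)));
        }
        pool.shutdown(); // accept no new tasks, let the queued ones finish
        try {
            if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
                pool.shutdownNow();
                throw new IllegalStateException("queries did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for queries", e);
        }
        return report;
    }

    public static void main(String[] args) {
        System.out.println(runAll(10, 4).size());
    }
}
```

Two things to watch with this design: ConcurrentHashMap rejects null keys and null values, and an exception thrown inside a task submitted with execute is not reported back to the caller the way a Future's get would report it, so failed queries need their own handling.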
