For each element in data1, I need to figure out what elements in data2 are related to it. Also, for each element in data2, I need to figure out what elements in data1 are related to it. Therefore, I setup a mutual data structure as you can see below:
01 class data1 {
02 // class variables
03 int id;
04 float d1_v1;
05 float d1_v2;
06 map<string, float> m_to_data2; // string is the keyword of data2, float recorded the reference info when data1 and data2 are related.
07 };
08
09 class data2 {
10 // class variables
11 int d2_v1;
float d2_v2;
12 list <int> src_id; // int is the id of data1
13 };
14
15 map<string, data2 *> map_2;
16 map<int, data1 *> map_1;
Then I parse file and fill the map_1 and map_2. I found:
the total memory usage after setting up the mutual-linked two maps: 498.7M.
without set the link from data2 to data1 (not fill
list <int> src_id), memory usage: 392.7M.Without fill map_1, without fill
list <int> src_idin data2, memory usage: 182.0MWithout fill map_1, fill
list <int> src_idwith ids of data1, memory usage: 289.7Mwithout fill map m_to_data2, memory usage: 290.0M
The size of map_1: 77737
The size of map_2: 1830009
The size of
map<string, float>m_to_data2 for each element of map_1 in the range of 3 - 17522The size of
list <int> s开发者_JS百科rc_idfor each element of map_2 in the range of 1- 1377
I need to reduce the memory usage after setting up the mutual-linked maps (ideally less than 200M, currently 498M as you can see above). I was trying to token the string (keyword of data2) to int by setting up an extra map <string, int>, since int needs less memory than string, but it may not help much since I need extra memory for the map <string, int>. Any suggestions?
Your comments/suggestions are highly appreciated.
I would start by the following:
- If possible (i.e. you have boost or tr1 available and ordering isn't important), change all
maps tounordered_maps. - If possible, change
listindata2tovector. - Have an
unordered_mapthat maps fromstringto anunsignedidand use thatideverywhere you're currently using strings. Given the size range of 3-17k form_to_data2, you're duplicating names a lot.
Looks like the float reference variable is taking up too much memory may be you can change it to a unsigned int.
If I understand your question correctly I would have rather used std::multimap instead of std::map
std::multimap<std::string, int> map2;
std::multimap<int, std::map<std::string, float> > map1;
that would make your code much easier to understand and proceed towards optimization.
加载中,请稍侯......
精彩评论