
Which data structure to use for huge but constant dictionary in C++

Developer https://www.devze.com 2023-04-05 07:17 Source: Internet

I have to use a huge dictionary with integer (or enum) keys and string values. It is completely constant and never changes at runtime. Is there a way (using templates, etc.) to retrieve dictionary data at compile time instead of using an existing dictionary structure?


Clang and LLVM have solved your issue by generating tables containing their objects, using a combination of code generation and preprocessor trickery.

You can skip either step, depending on your own setup. For example:

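// records.inc
// No trailing semicolons: the entries are also expanded inside the enum body,
// where a stray ';' would not compile.
EXPAND_RECORD(Foo, "Foo", 4)
EXPAND_RECORD(Bar, "Bar", 18)
EXPAND_RECORD(Bar2, "Bar", 19)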

Now, you can generate your enum:

// records.h
enum Record {

#define EXPAND_RECORD(Name, String, Value) Name,
#include "records.inc"
#undef EXPAND_RECORD

};

char const* getRecordName(Record r);
int getRecordValue(Record r);

// records.cpp
#include <cstdlib>   // abort
#include "records.h"

char const* getRecordName(Record r) {
  switch(r) {
#define EXPAND_RECORD(Name, String, Value) case Name: return String;
#include "records.inc"
#undef EXPAND_RECORD
  }

  abort(); // unreachable, or you can return a "null" value
}

int getRecordValue(Record r) {
  switch(r) {
#define EXPAND_RECORD(Name, String, Value) case Name: return Value;
#include "records.inc"
#undef EXPAND_RECORD
  }

  abort(); // unreachable, or you can return a "null" value
}

In Clang and LLVM, a code generation phase is used to generate the .inc from more pleasant definition files.

It works pretty well, but be aware that any modification of the enum forces recompilation of everything that includes the header. You might prefer a "codeset" approach: the enum is used internally but never leaked outside, and clients receive its stable values as plain unsigned integers. Old clients can then link against new libraries without recompilation; they are limited to the old set of codes, which is no problem as long as those codes stay stable.


Surely you can simply use sed to transform the dictionary into string constants indexed by a template parameter, with a header file like:

template <int Index> struct Dictionary { static const char *entry; };

and a source file with many lines of the form:

template <> const char *Dictionary<5>::entry = "Entry for five";

On the other hand, do you really want to do this from a maintenance perspective? It entails recompilation for every changed dictionary entry and bloated executable sizes.


How about automatic code generation? Take the configuration file or database or whatever the source is and generate C++ header code from that. It could look something like this:

#define MYCONST_1 "#00ff00"
#define MYCONST_10 "My other configuration string"

You can do the conversion with a simple bash script or with ruby/python (or C++ if you are a masochist); it depends on the complexity of your configuration file.

Then write some make rules to automatically create the header file(s) when the configuration file changes.
