If NoSQL is schemaless, how can it be good for massive updates?


Developer https://www.devze.com 2023-04-08 22:48 Source: web



I was thinking about a schema for a problem where "people" have a country. In an RDBMS I can make two tables, one for people and one for countries, and use a key to connect the two. If I want to change the name of a country, I just change it in the "country" table and every person gets the new value. But if I model this in a key/document store like MongoDB, I need only one document for "people", with the country value inside it, like:

{name:"Tiago",
Country: "Brazil"}

Now, if I want to change every "Brazil" to "BraSil", I will have to search for ALL people whose country equals "Brazil" and then update them? Won't that be slower than in an RDBMS?
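To make the comparison in the question concrete, here is a minimal sketch in plain Python that counts how many records a rename has to write under each layout. It uses in-memory dicts and lists only; the sample names, the `country_id` key and both helper functions are made up for illustration and are not any database's API.

```python
# Hypothetical in-memory data, used only to count how many records a rename touches.

# Relational-style layout: each person references a country by key.
countries = {1: "Brazil", 2: "Portugal"}
people_normalized = [
    {"name": "Tiago", "country_id": 1},
    {"name": "Ana", "country_id": 1},
    {"name": "Rui", "country_id": 2},
]

# Document-style layout: each person embeds the country name.
people_embedded = [
    {"name": "Tiago", "Country": "Brazil"},
    {"name": "Ana", "Country": "Brazil"},
    {"name": "Rui", "Country": "Portugal"},
]


def rename_normalized(country_table, country_id, new_name):
    """Rename one row in the lookup table; return the number of records written."""
    country_table[country_id] = new_name
    return 1


def rename_embedded(people, old_name, new_name):
    """Rewrite every matching document; return the number of records written."""
    written = 0
    for person in people:
        if person["Country"] == old_name:
            person["Country"] = new_name
            written += 1
    return written


normalized_writes = rename_normalized(countries, 1, "Brasil")
embedded_writes = rename_embedded(people_embedded, "Brazil", "Brasil")
print(normalized_writes, embedded_writes)
```

The embedded layout writes once per matching person, so the cost of a rename grows with the number of people in that country, while the lookup-table layout writes once regardless.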

There's no one answer; the technique will depend on the engine. As you said, "NoSQL" databases are not all created equal.

In CouchDB, you (usually) access all of your data through views, and you could easily change something like this in a view. The upside is that since views are plain old JavaScript, you can express just about any kind of transformation quite easily. The downside is that all views in CouchDB are basically the same idea as materialized views in SQL databases: the first time you try to access such a value, the response will take as long as is necessary to rebuild the whole view. If that's a lot of data to be rewritten, that'll take a while.
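The view idea can be sketched roughly like this. This is a plain-Python model of the concept, not CouchDB's API (real CouchDB views are JavaScript map functions stored in design documents); the `RENAMES` table, `map_person` and `build_view` are hypothetical names.

```python
# Plain-Python model of a view-time transformation; NOT CouchDB's API.
stored_docs = [
    {"_id": "p1", "name": "Tiago", "Country": "Brazil"},
    {"_id": "p2", "name": "Rui", "Country": "Portugal"},
]

# Hypothetical rename table consulted by the map function.
RENAMES = {"Brazil": "Brasil"}


def map_person(doc):
    """Yield (key, value) rows, applying the rename when the view is built."""
    country = RENAMES.get(doc["Country"], doc["Country"])
    yield country, doc["name"]


def build_view(docs, map_fn):
    """Run the map function over every document and sort the rows by key."""
    rows = [row for doc in docs for row in map_fn(doc)]
    return sorted(rows)


# Rebuilding walks every document, which is the slow first access described above.
view = build_view(stored_docs, map_person)
print(view)
```

The stored documents are never rewritten here; only the derived rows change, and the price is a full pass over the data whenever the view has to be rebuilt.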

In AppScale or Google App Engine, you'd have to change the entities directly. The only really effective way to systematically touch a possibly large set of data is to use a task queue: you'd repeatedly query for entities that match "Brazil" and update them until the query returns no more results. During that time, real queries will return a mixture of both values.
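The query-until-empty loop might look something like the sketch below. It runs against an in-memory list, not the App Engine datastore or task queue APIs; `fetch_batch`, `rename_in_batches` and the batch size are assumptions made for illustration, and in a real deployment each loop iteration would be one queued task.

```python
def fetch_batch(store, country, limit):
    """Stand-in for a datastore query: up to `limit` entities matching `country`."""
    return [entity for entity in store if entity["Country"] == country][:limit]


def rename_in_batches(store, old_name, new_name, batch_size=2):
    """Query and update repeatedly until nothing matches; return the batch count."""
    batches = 0
    while True:
        batch = fetch_batch(store, old_name, batch_size)
        if not batch:
            return batches
        for entity in batch:
            entity["Country"] = new_name
        batches += 1
        # Between batches, readers can see a mixture of old and new values.


# Hypothetical sample data: five people, all in "Brazil".
store = [{"name": f"person{i}", "Country": "Brazil"} for i in range(5)]
batches = rename_in_batches(store, "Brazil", "Brasil")
print(batches)
```

Because each pass only picks up entities that still hold the old value, the job can be interrupted and restarted without redoing work it has already finished.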

In all cases, the strength of distributed document databases comes with a kind of assumption that some kinds of operations are computationally intensive, but the engine itself is designed to stay available even while these operations are under way. This is largely a consequence of the trade-offs that databases make under the CAP theorem.