Sorting on database server or application server in n-tier architecture [closed]_问答_开发者

Sorting on database server or application server in n-tier architecture [closed]

开发者 https://www.devze.com 2023-03-02 04:06 出处：网络

Closed. This question is opinion-based. It is not currently accepting answers. Want to improve this question? Update the question so it can be answered with facts and citations by editing

Closed. This question is opinion-based. It is not currently accepting answers.

Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.

Closed 6 years ago.

Improve this question

Suppose I am developing an application with a single database server and multiple application servers, where it is cheap and easy to add application servers but difficult to scale the database. Suppose I want to retrieve some information from the database that 开发者_开发百科needs to be sorted. All else being equal, it seems like I should prefer to sort on the application servers, since that shifts the load away from the database, which is hard to scale.

Now there are certainly some cases in which sorting on the database server is a no-brainer:

Sorting is necessary in order to obtain the correct result set. For example, if I want the top N according to some criterion, I obviously have to sort before I even know which rows I want. Sorting on the application server isn't an option here (unless I'm willing to suck down the entire table, which is typically not what I want to do).
There is an index that supports my sort order. In this case sorting on the database server is essentially free.

But other than that, am I generally correct to prefer sorting on the application server? Are there some cases I should consider in addition to those listed above?

My instinct is to sort the data on the database server, as that is one of it's prime functions and it is likely extremely efficient at it. The danger however is that the data may get resorted anyway at the client level, thus wasting processes.

If you have a database server that is so stressed it can no longer sort data quickly, you have bigger problems.

If the majority of the queries running on a server have been optimized, if the schema is rational, and the indexes are in place, a database server can do an enormous amount of work without even breaking a sweat.

I'll supplement Jaimal's comment with my own experience using the PostgreSQL DBMS. If you have a large shared buffer pool and you can prepare your statements which you are concerned about sort performance, you get a high-performance cache "for free" from your DBMS. If your queries cannot be prepared but you can limit the attributes you need in the result set, you can make an index on those attributes with your sort predicate. If you cannot perform any of these optimizations on the back end, then sorting in the application server will work well.

Regarding performance differences between sorting in an application and in the DBMS, I would expect the application language to have some overhead depending on it's object model. For example, I would expect sorting 1000000 Ruby objects versus 1000000 PostgreSQL tuples would show that the database is faster.

I believe that you are right. In the absence of an index the database has no performance advantage over sorting on your application server. In fact, on your app server you have control over which sorting algorithm you use, so in principle you could use something like radix sort (O(n) time) rather than quicksort if it applies to your case.

If your data doesn't change to often (you're willing to cache data) and you have a limited number of possible result sets, then you could sort on the database, but cache the result set or cache an array of keys for the result set saving having to do always perform the same sort of the same data.