I'm querying bus stops from a database, and I wish to have it only return 1 stop per bus line/direction. This query does just that:
Stop.select("DISTINCT line_id, direction")
Except that it won't give me any other attribute than those 2. I tried a couple of other queries to have it return the id in addition to the line_id and direction fields (ideally it would return all columns), with no luck:
Stop.select("DISTINCT line_id, direction, id")
and
Stop.select("DISTINCT(line_id || '-' || direction), id")
In both cases, the query loses its distinct clause and all rows are returned.
Some awesome dude helped me out and suggested to use a 开发者_运维问答subquery to have it return all the ids:
Stop.find_by_sql("SELECT DISTINCT a1.line_id, a1.direction, (SELECT a2.id from stops a2 where a2.line_id = a1.line_id AND a2.direction = a1.direction ORDER BY a2.id ASC LIMIT 1) as id FROM stops a1
I can then extract all the ids and perform a 2nd query to fetch the full attributes for each stop.
Is there a way to have it all inside 1 query AND have it return all the attributes?
Stop.select("DISTINCT ON (line_id, direction) *")
Not so fast - The other answer selects stop_id arbitrary
This is why your question makes no sense. We can pull stop_ids and have distinct line_id and direction. But we have no idea why we have the stop_id we do.
    create temp table test( line_id integer, direction char(1), stop_id      integer);
    insert into test values
            (1, 'N', 1),
            (1, 'N', 2),
            (1, 'S', 1),
            (1, 'S', 2),
            (2, 'N', 1),
            (2, 'N', 2),
            (2, 'S', 1),
            (2, 'S', 2)
    ;
    select distinct on (line_id, direction) * from test;
    -- do this again but will reverse the order of stop_ids
    -- could it possible change our Robust Query?!!!
    drop table test;
    create temp table test(line_id integer,direction char(1),stop_id integer);
    insert into test values
            (1, 'N', 2),
            (1, 'N', 1),
            (1, 'S', 2),
            (1, 'S', 1),
            (2, 'N', 2),
            (2, 'N', 1),
            (2, 'S', 2),
            (2, 'S', 1)
    ;
    select distinct on (line_id, direction) * from test;
First select:
line_id | direction | stop_id 
---------+-----------+---------
       1 | N         |       1
       1 | S         |       1
       2 | N         |       1
       2 | S         |       1
Second select:
line_id | direction | stop_id 
---------+-----------+---------
       1 | N         |       2
       1 | S         |       2
       2 | N         |       2
       2 | S         |       2
So we got away without grouping stop_id but we have no guarantees why we got the one we did. All we know is that this is valid stop_id. Any updates, inserts, and other stuff that no RDMS will guarantee can be changing around the physical order of rows.
This is what I meant in the top comment. There is no known reason for pulling one stop_id over the other one, but somehow you need this stop_id (or whatever else) desperately.
 
         
                                         
                                         
                                         
                                        ![Interactive visualization of a graph in python [closed]](https://www.devze.com/res/2023/04-10/09/92d32fe8c0d22fb96bd6f6e8b7d1f457.gif) 
                                         
                                         
                                         
                                         加载中,请稍侯......
 加载中,请稍侯......
      
精彩评论