madlib-dev mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [madlib] fmcquillan99 commented on issue #459: DL: Add support for asymmetric segment distribution to preprocessor
Date Wed, 27 Nov 2019 03:58:03 GMT
fmcquillan99 commented on issue #459: DL: Add support for asymmetric segment distribution to preprocessor
URL: https://github.com/apache/madlib/pull/459#issuecomment-558917862
 
 
   It looks like it's almost there. The last thing to fix: a distribution table with duplicate rows, such as
   ```
   madlib=# SELECT * FROM segments_to_use ORDER BY hostname, dbid;
    dbid |       hostname        
   ------+-----------------------
       2 | pm-demo-machine-keras
       2 | pm-demo-machine-keras
       2 | pm-demo-machine-keras
       2 | pm-demo-machine-keras
       2 | pm-demo-machine-keras
       3 | pm-demo-machine-keras
   (6 rows)
   ```
   produces
   ```
   madlib=# DROP TABLE IF EXISTS image_data_packed, image_data_packed_summary;
   DROP TABLE
   Time: 22.951 ms
   madlib=# SELECT madlib.training_preprocessor_dl('image_data',         -- Source table
                                                   'image_data_packed',  -- Output table
                                                   'species',            -- Dependent variable
                                                   'rgb',                -- Independent variable
                                                   NULL,                 -- Buffer size
                                                   255,                  -- Normalizing constant
                                                   NULL,
                                                   'segments_to_use'
                                                  );
   -[ RECORD 1 ]------------+-
   training_preprocessor_dl | 
   
   Time: 2355.189 ms
   madlib=# 
   madlib=# 
   madlib=# SELECT * FROM image_data_packed_summary;
   -[ RECORD 1 ]-----------+------------------
   source_table            | image_data
   output_table            | image_data_packed
   dependent_varname       | species
   independent_varname     | rgb
   dependent_vartype       | text
   class_values            | {bird,cat,dog}
   buffer_size             | 26
   normalizing_const       | 255
   num_classes             | 3
   distribution_rules      | {2,2,2,2,2,3}
   __internal_gpu_config__ | {0,0,0,0,0,1}
   ```
   I'd suggest throwing an error if there is a duplicate row in the distribution table, rather than passing it through to `distribution_rules` like this.
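
As a rough illustration of the suggested check, here is a minimal Python sketch that rejects a distribution table containing repeated rows. The function names `find_duplicate_segments` and `validate_distribution_rows` are hypothetical and are not part of MADlib. The sketch assumes the rows have already been fetched as `(dbid, hostname)` tuples, so it shows only the validation logic, not how the preprocessor reads the table.

```python
from collections import Counter


def find_duplicate_segments(rows):
    """Return the (dbid, hostname) pairs that appear more than once."""
    counts = Counter(rows)
    return [row for row, n in counts.items() if n > 1]


def validate_distribution_rows(rows):
    """Raise ValueError if any row is repeated; otherwise return the dbids."""
    duplicates = find_duplicate_segments(rows)
    if duplicates:
        raise ValueError(
            "distribution table contains duplicate rows: {}".format(duplicates)
        )
    return [dbid for dbid, _hostname in rows]


# Rows copied from the segments_to_use table shown above.
rows = [(2, "pm-demo-machine-keras")] * 5 + [(3, "pm-demo-machine-keras")]

try:
    validate_distribution_rows(rows)
except ValueError as err:
    print(err)
```

With a check like this, the call above would fail early instead of writing `{2,2,2,2,2,3}` into the summary table.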
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
