hawq-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Hubert Zhang <hzh...@pivotal.io>
Subject Re: Re: External scan error: There are more external files (URLs) than primary segments that can read them (COptTasks.cpp:1756)
Date Fri, 23 Sep 2016 00:01:05 GMT
+1 with Vineet's comment.
in your statement "insert into call_center select * from ext_call_center;"
There are two strict restriction of how many virtual segments need to be
started:
1 the bucket number of hash distributed result relation call_center (must
equal to #vseg)
2 the number of locations in  gpfdist ext table ext_call_center (<= #vseg)

You must set the the bucket number of call_center bigger than the the
number of locations in ext_call_center

Thanks
Hubert


On Thu, Sep 22, 2016 at 1:19 AM, Luis Macedo <lmacedo@pivotal.io> wrote:

> Also on your location clause you should not reference the same file more
> than one time.
>
> If you want to scale gpfdist process you need to use different range on
> each port for a same server. (Not sure if I explain myself :))
>
> If you use one gpfdist per server performance should be fine. One gpfdist
> process usually can do 250MB/s if underlying infra allows.
>
> Rgds
>
> --- Sent from my Nexus 5x
>
> Em 21 de set de 2016 12:14 PM, "Vineet Goel" <vvineet@apache.org>
> escreveu:
>
>> Your default_hash_table_bucket_number value is set to 6. Typically, this
>> should be adjusted as 6 x #_of_your_segment_hosts. With 3 segments, you
>> should set this value to 18. Any time you change this parameter, you should
>> redistribute your HASH distributed tables, if you have any (unless the
>> table DDL has # of buckets defined, I think).
>>
>> Increase default_hash_table_bucket_number to 18 and retry the insert.
>> Since you have 18 ext table URLs, it should work with the change.
>>
>> Thanks
>> Vineet
>>
>>
>> On Wed, Sep 21, 2016 at 12:26 AM 来熊 <yin.zhb@163.com> wrote:
>>
>>>
>>> My environment is >>>> : 1 master 3 segments
>>> SQL >>>>>>:
>>> CREATE TABLE call_center (
>>>     cc_call_center_sk integer,
>>>     cc_call_center_id character varying(16),
>>>     cc_rec_start_date date,
>>>     cc_rec_end_date date,
>>>     cc_closed_date_sk integer,
>>>     cc_open_date_sk integer,
>>>     cc_name character varying(50),
>>>     cc_class character varying(50),
>>>     cc_employees integer,
>>>     cc_sq_ft integer,
>>>     cc_hours character varying(20),
>>>     cc_manager character varying(40),
>>>     cc_mkt_id integer,
>>>     cc_mkt_class character varying(50),
>>>     cc_mkt_desc character varying(100),
>>>     cc_market_manager character varying(40),
>>>     cc_division text,
>>>     cc_division_name character varying(50),
>>>     cc_company text,
>>>     cc_company_name character varying(50),
>>>     cc_street_number character varying(10),
>>>     cc_street_name character varying(60),
>>>     cc_street_type character varying(15),
>>>     cc_suite_number character varying(10),
>>>     cc_city character varying(60),
>>>     cc_county character varying(30),
>>>     cc_state text,
>>>     cc_zip character varying(10),
>>>     cc_country character varying(20),
>>>     cc_gmt_offset numeric(5,2),
>>>     cc_tax_percentage numeric(5,2)
>>> )
>>> WITH (appendonly=true, orientation=parquet)
>>> DISTRIBUTED BY (cc_call_center_sk);
>>>
>>> CREATE EXTERNAL TABLE ext_call_center (like call_center)
>>> LOCATION (
>>> 'gpfdist://segment3:9001/call_center_[0-9]*_[0-9]*.dat',
>>> 'gpfdist://segment3:9002/call_center_[0-9]*_[0-9]*.dat',
>>> 'gpfdist://segment3:9003/call_center_[0-9]*_[0-9]*.dat',
>>> 'gpfdist://segment3:9004/call_center_[0-9]*_[0-9]*.dat',
>>> 'gpfdist://segment3:9005/call_center_[0-9]*_[0-9]*.dat',
>>> 'gpfdist://segment3:9006/call_center_[0-9]*_[0-9]*.dat',
>>> 'gpfdist://segment2:9001/call_center_[0-9]*_[0-9]*.dat',
>>> 'gpfdist://segment2:9002/call_center_[0-9]*_[0-9]*.dat',
>>> 'gpfdist://segment2:9003/call_center_[0-9]*_[0-9]*.dat',
>>> 'gpfdist://segment2:9004/call_center_[0-9]*_[0-9]*.dat',
>>> 'gpfdist://segment2:9005/call_center_[0-9]*_[0-9]*.dat',
>>> 'gpfdist://segment2:9006/call_center_[0-9]*_[0-9]*.dat',
>>> 'gpfdist://segment1:9001/call_center_[0-9]*_[0-9]*.dat',
>>> 'gpfdist://segment1:9002/call_center_[0-9]*_[0-9]*.dat',
>>> 'gpfdist://segment1:9003/call_center_[0-9]*_[0-9]*.dat',
>>> 'gpfdist://segment1:9004/call_center_[0-9]*_[0-9]*.dat',
>>> 'gpfdist://segment1:9005/call_center_[0-9]*_[0-9]*.dat',
>>> 'gpfdist://segment1:9006/call_center_[0-9]*_[0-9]*.dat')
>>> FORMAT 'TEXT' (DELIMITER '|' NULL AS '' ESCAPE AS E'\\');
>>>
>>> insert into call_center select * from ext_call_center;
>>>
>>> ERROR:  External scan error: There are more external files (URLs) than
>>> primary segments that can read them (COptTasks.cpp:1756)
>>>
>>> hawq config >>>>:
>>>
>>> GUC : add_missing_from Value : off
>>> GUC : application_name Value :
>>> GUC : array_nulls Value : on
>>> GUC : authentication_timeout Value : 60
>>> GUC : backslash_quote Value : safe_encoding
>>> GUC : block_size Value : 32768
>>> GUC : bonjour_name Value :
>>> GUC : check_function_bodies Value : on
>>> GUC : client_encoding Value : UTF8
>>> GUC : client_min_messages Value : ERROR
>>> GUC : cpu_index_tuple_cost Value : 0.005
>>> GUC : cpu_operator_cost Value : 0.0025
>>> GUC : cpu_tuple_cost Value : 0.01
>>> GUC : cursor_tuple_fraction Value : 1
>>> GUC : custom_variable_classes Value :
>>> GUC : DateStyle Value : ISO, MDY
>>> GUC : db_user_namespace Value : off
>>> GUC : deadlock_timeout Value : 1000
>>> GUC : debug_assertions Value : off
>>> GUC : debug_pretty_print Value : off
>>> GUC : debug_print_parse Value : off
>>> GUC : debug_print_plan Value : off
>>> GUC : debug_print_prelim_plan Value : off
>>> GUC : debug_print_rewritten Value : off
>>> GUC : debug_print_slice_table Value : off
>>> GUC : default_hash_table_bucket_number Value : 6
>>> GUC : default_statement_mem Value : 128000
>>> GUC : default_statistics_target Value : 25
>>> GUC : default_tablespace Value :
>>> GUC : default_transaction_isolation Value : read committed
>>> GUC : default_transaction_read_only Value : off
>>> GUC : dfs_url Value : localhost:8020/hawq
>>> GUC : dynamic_library_path Value : $libdir
>>> GUC : effective_cache_size Value : 16384
>>> GUC : enable_bitmapscan Value : on
>>> GUC : enable_groupagg Value : on
>>> GUC : enable_hashagg Value : on
>>> GUC : enable_hashjoin Value : on
>>> GUC : enable_indexscan Value : on
>>> GUC : enable_mergejoin Value : off
>>> GUC : enable_nestloop Value : off
>>> GUC : enable_secure_filesystem Value : off
>>> GUC : enable_seqscan Value : on
>>> GUC : enable_sort Value : on
>>> GUC : enable_tidscan Value : on
>>> GUC : escape_string_warning Value : on
>>> GUC : explain_memory_verbosity Value : suppress
>>> GUC : explain_pretty_print Value : on
>>> GUC : extra_float_digits Value : 0
>>> GUC : filesystem_support_truncate Value : on
>>> GUC : from_collapse_limit Value : 20
>>> GUC : gp_adjust_selectivity_for_outerjoins Value : on
>>> GUC : gp_analyze_relative_error Value : 0.25
>>> GUC : gp_autostats_mode Value : NONE
>>> GUC : gp_autostats_on_change_threshold Value : 2147483647
>>> GUC : gp_backup_directIO Value : off
>>> GUC : gp_backup_directIO_read_chunk_mb Value : 20
>>> GUC : gp_cached_segworkers_threshold Value : 5
>>> GUC : gp_command_count Value : 6
>>> GUC : gp_connections_per_thread Value : 64
>>> GUC : gp_debug_linger Value : 0
>>> GUC : gp_dynamic_partition_pruning Value : on
>>> GUC : gp_email_connect_avoid_duration Value : 7200
>>> GUC : gp_email_connect_failures Value : 5
>>> GUC : gp_email_connect_timeout Value : 15
>>> GUC : gp_email_from Value :
>>> GUC : gp_email_smtp_password Value :
>>> GUC : gp_email_smtp_server Value : localhost:25
>>> GUC : gp_email_smtp_userid Value :
>>> GUC : gp_email_to Value :
>>> GUC : gp_enable_agg_distinct Value : on
>>> GUC : gp_enable_agg_distinct_pruning Value : on
>>> GUC : gp_enable_direct_dispatch Value : on
>>> GUC : gp_enable_fallback_plan Value : on
>>> GUC : gp_enable_fast_sri Value : on
>>> GUC : gp_enable_gpperfmon Value : off
>>> GUC : gp_enable_groupext_distinct_gather Value : on
>>> GUC : gp_enable_groupext_distinct_pruning Value : on
>>> GUC : gp_enable_multiphase_agg Value : on
>>> GUC : gp_enable_predicate_propagation Value : on
>>> GUC : gp_enable_preunique Value : on
>>> GUC : gp_enable_sequential_window_plans Value : on
>>> GUC : gp_enable_sort_distinct Value : on
>>> GUC : gp_enable_sort_limit Value : on
>>> GUC : gp_external_enable_exec Value : on
>>> GUC : gp_external_grant_privileges Value : off
>>> GUC : gp_external_max_segs Value : 64
>>> GUC : gp_filerep_tcp_keepalives_count Value : 2
>>> GUC : gp_filerep_tcp_keepalives_idle Value : 60
>>> GUC : gp_filerep_tcp_keepalives_interval Value : 30
>>> GUC : gp_force_use_default_temporary_directory Value : off
>>> GUC : gp_gpperfmon_send_interval Value : 1
>>> GUC : gp_hashjoin_tuples_per_bucket Value : 5
>>> GUC : gp_idf_deduplicate Value : auto
>>> GUC : gp_interconnect_cache_future_packets Value : on
>>> GUC : gp_interconnect_default_rtt Value : 20
>>> GUC : gp_interconnect_fc_method Value : LOSS
>>> GUC : gp_interconnect_hash_multiplier Value : 2
>>> GUC : gp_interconnect_min_retries_before_timeout Value : 100
>>> GUC : gp_interconnect_min_rto Value : 20
>>> GUC : gp_interconnect_queue_depth Value : 4
>>> GUC : gp_interconnect_setup_timeout Value : 7200
>>> GUC : gp_interconnect_snd_queue_depth Value : 2
>>> GUC : gp_interconnect_timer_checking_period Value : 20
>>> GUC : gp_interconnect_timer_period Value : 5
>>> GUC : gp_interconnect_transmit_timeout Value : 3600
>>> GUC : gp_interconnect_type Value : UDP
>>> GUC : gp_log_format Value : csv
>>> GUC : gp_max_csv_line_length Value : 1048576
>>> GUC : gp_max_databases Value : 16
>>> GUC : gp_max_filespaces Value : 8
>>> GUC : gp_max_packet_size Value : 8192
>>> GUC : gp_max_partition_level Value : 0
>>> GUC : gp_max_plan_size Value : 0
>>> GUC : gp_max_relations Value : 65536
>>> GUC : gp_max_tablespaces Value : 16
>>> GUC : gp_motion_cost_per_row Value : 0
>>> GUC : gp_num_contents_in_cluster Value : -10000
>>> GUC : gp_query_context_mem_limit Value : 102400
>>> GUC : gp_reject_percent_threshold Value : 300
>>> GUC : gp_reraise_signal Value : on
>>> GUC : gp_role Value : utility
>>> GUC : gp_safefswritesize Value : 0
>>> GUC : gp_segment_connect_timeout Value : 600
>>> GUC : gp_segments_for_planner Value : 0
>>> GUC : gp_session_id Value : -1
>>> GUC : gp_set_proc_affinity Value : off
>>> GUC : gp_snmp_community Value : public
>>> GUC : gp_snmp_monitor_address Value :
>>> GUC : gp_snmp_use_inform_or_trap Value : trap
>>> GUC : gp_statistics_pullup_from_child_partition Value : on
>>> GUC : gp_statistics_use_fkeys Value : on
>>> GUC : gp_subtrans_warn_limit Value : 16777216
>>> GUC : gp_temporary_directory_mark_error Value : 0
>>> GUC : gp_udp_bufsize_k Value : 0
>>> GUC : gp_vmem_protect_segworker_cache_limit Value : 500
>>> GUC : gp_workfile_checksumming Value : on
>>> GUC : gp_workfile_compress_algorithm Value : none
>>> GUC : gp_workfile_limit_per_query Value : 0
>>> GUC : gp_workfile_limit_per_segment Value : 0
>>> GUC : gpperfmon_port Value : 8888
>>> GUC : hawq_global_rm_type Value : yarn
>>> GUC : hawq_master_address_host Value : master.bigdata
>>> GUC : hawq_master_address_port Value : 6432
>>> GUC : hawq_master_temp_directory Value : /tmp
>>> GUC : hawq_metadata_cache_block_capacity Value : 2097152
>>> GUC : hawq_metadata_cache_check_interval Value : 30
>>> GUC : hawq_metadata_cache_flush_ratio Value : 0.85
>>> GUC : hawq_metadata_cache_free_block_max_ratio Value : 0.05
>>> GUC : hawq_metadata_cache_free_block_normal_ratio Value : 0.2
>>> GUC : hawq_metadata_cache_max_hdfs_file_num Value : 524288
>>> GUC : hawq_metadata_cache_reduce_ratio Value : 0.7
>>> GUC : hawq_metadata_cache_refresh_interval Value : 3600
>>> GUC : hawq_metadata_cache_refresh_max_num Value : 1000
>>> GUC : hawq_metadata_cache_refresh_timeout Value : 3600
>>> GUC : hawq_re_cgroup_hierarchy_name Value : hadoop-yarn
>>> GUC : hawq_re_cgroup_mount_point Value : /sys/fs/cgroup
>>> GUC : hawq_re_cpu_enable Value : off
>>> GUC : hawq_re_cpu_weight Value : 1024
>>> GUC : hawq_re_memory_overcommit_max Value : 8192
>>> GUC : hawq_re_vcore_pcore_ratio Value : 1
>>> GUC : hawq_rm_cluster_report_period Value : 60
>>> GUC : hawq_rm_clusterratio_core_to_memorygb_factor Value : 5
>>> GUC : hawq_rm_connpool_sameaddr_buffersize Value : 2
>>> GUC : hawq_rm_container_batch_limit Value : 1000
>>> GUC : hawq_rm_enable_connpool Value : on
>>> GUC : hawq_rm_force_alterqueue_cancel_queued_request Value : on
>>> GUC : hawq_rm_force_fifo_queuing Value : on
>>> GUC : hawq_rm_master_domain_port Value : 5436
>>> GUC : hawq_rm_master_port Value : 5437
>>> GUC : hawq_rm_memory_limit_perseg Value : 480GB
>>> GUC : hawq_rm_min_resource_perseg Value : 2
>>> GUC : hawq_rm_nocluster_timeout Value : 60
>>> GUC : hawq_rm_nresqueue_limit Value : 128
>>> GUC : hawq_rm_nslice_perseg_limit Value : 5000
>>> GUC : hawq_rm_nvcore_limit_perseg Value : 16
>>> GUC : hawq_rm_nvseg_for_analyze_nopart_perquery_limit Value : 512
>>> GUC : hawq_rm_nvseg_for_analyze_nopart_perquery_perseg_limit Value : 8
>>> GUC : hawq_rm_nvseg_for_analyze_part_perquery_limit Value : 256
>>> GUC : hawq_rm_nvseg_for_analyze_part_perquery_perseg_limit Value : 4
>>> GUC : hawq_rm_nvseg_for_copy_from_perquery Value : 6
>>> GUC : hawq_rm_nvseg_perquery_limit Value : 512
>>> GUC : hawq_rm_nvseg_perquery_perseg_limit Value : 6
>>> GUC : hawq_rm_nvseg_variance_amon_seg_limit Value : 1
>>> GUC : hawq_rm_nvseg_variance_amon_seg_respool_limit Value : 2
>>> GUC : hawq_rm_regularize_io_factor Value : 1
>>> GUC : hawq_rm_regularize_io_max Value : 1.37439e+11
>>> GUC : hawq_rm_regularize_nvseg_factor Value : 1
>>> GUC : hawq_rm_regularize_nvseg_max Value : 300
>>> GUC : hawq_rm_regularize_usage_factor Value : 1
>>> GUC : hawq_rm_rejectrequest_nseg_limit Value : 0.25
>>> GUC : hawq_rm_request_timeoutcheck_interval Value : 1
>>> GUC : hawq_rm_resource_allocation_timeout Value : 600
>>> GUC : hawq_rm_resource_idle_timeout Value : 300
>>> GUC : hawq_rm_respool_test_file Value :
>>> GUC : hawq_rm_return_percent_on_overcommit Value : 10
>>> GUC : hawq_rm_segment_config_refresh_interval Value : 30
>>> GUC : hawq_rm_segment_heartbeat_interval Value : 30
>>> GUC : hawq_rm_segment_heartbeat_timeout Value : 300
>>> GUC : hawq_rm_segment_port Value : 5438
>>> GUC : hawq_rm_segment_tmpdir_detect_interval Value : 300
>>> GUC : hawq_rm_session_lease_heartbeat_enable Value : on
>>> GUC : hawq_rm_session_lease_heartbeat_interval Value : 10
>>> GUC : hawq_rm_session_lease_timeout Value : 180
>>> GUC : hawq_rm_stmt_nvseg Value : 0
>>> GUC : hawq_rm_stmt_vseg_memory Value : 128mb
>>> GUC : hawq_rm_tolerate_nseg_limit Value : 0.25
>>> GUC : hawq_rm_yarn_address Value : worker1.bigdata:8050
>>> GUC : hawq_rm_yarn_app_name Value : hawq
>>> GUC : hawq_rm_yarn_queue_name Value : default
>>> GUC : hawq_rm_yarn_scheduler_address Value : worker1.bigdata:8030
>>> GUC : hawq_segment_address_port Value : 40000
>>> GUC : hawq_segment_history_keep_period Value : 365
>>> GUC : hawq_segment_temp_directory Value : /tmp
>>> GUC : integer_datetimes Value : on
>>> GUC : IntervalStyle Value : postgres
>>> GUC : join_collapse_limit Value : 20
>>> GUC : krb5_ccname Value : /tmp/postgres.ccname
>>> GUC : krb_caseins_users Value : off
>>> GUC : krb_server_keyfile Value : FILE:/data/pulse2-agent/agents
>>> /agent1/work/HAWQ-main-opt/rhel5_x86_64/src/hawq-db-dist/etc/krb5.keytab
>>> GUC : krb_srvname Value : postgres
>>> GUC : lc_collate Value : C
>>> GUC : lc_ctype Value : C
>>> GUC : lc_messages Value : en_US.utf8
>>> GUC : lc_monetary Value : en_US.utf8
>>> GUC : lc_numeric Value : en_US.utf8
>>> GUC : lc_time Value : en_US.utf8
>>> GUC : listen_addresses Value : *
>>> GUC : local_preload_libraries Value :
>>> GUC : log_autostats Value : off
>>> GUC : log_connections Value : off
>>> GUC : log_disconnections Value : off
>>> GUC : log_dispatch_stats Value : off
>>> GUC : log_duration Value : off
>>> GUC : log_error_verbosity Value : default
>>> GUC : log_executor_stats Value : off
>>> GUC : log_hostname Value : off
>>> GUC : log_min_duration_statement Value : -1
>>> GUC : log_min_error_statement Value : error
>>> GUC : log_min_messages Value : warning
>>> GUC : log_parser_stats Value : off
>>> GUC : log_planner_stats Value : off
>>> GUC : log_rotation_age Value : 1440
>>> GUC : log_rotation_size Value : 0
>>> GUC : log_statement Value : none
>>> GUC : log_statement_stats Value : off
>>> GUC : log_timezone Value : PRC
>>> GUC : log_truncate_on_rotation Value : off
>>> GUC : maintenance_work_mem Value : 65536
>>> GUC : master_directory Value :
>>> GUC : max_appendonly_segfiles Value : 262144
>>> GUC : max_appendonly_tables Value : 10000
>>> GUC : max_connections Value : 1280
>>> GUC : max_files_per_process Value : 150
>>> GUC : max_fsm_pages Value : 200000
>>> GUC : max_fsm_relations Value : 1000
>>> GUC : max_function_args Value : 100
>>> GUC : max_identifier_length Value : 63
>>> GUC : max_index_keys Value : 32
>>> GUC : max_locks_per_transaction Value : 128
>>> GUC : max_prepared_transactions Value : 250
>>> GUC : max_stack_depth Value : 2048
>>> GUC : max_work_mem Value : 1024000
>>> GUC : metadata_cache_testfile Value :
>>> GUC : optimizer Value : on
>>> GUC : optimizer_analyze_root_partition Value : on
>>> GUC : optimizer_minidump Value : onerror
>>> GUC : optimizer_parts_to_force_sort_on_insert Value : 160
>>> GUC : password_encryption Value : on
>>> GUC : password_hash_algorithm Value : MD5
>>> GUC : pljava_classpath Value :
>>> GUC : pljava_release_lingering_savepoints Value : off
>>> GUC : pljava_statement_cache_size Value : 0
>>> GUC : pljava_vmoptions Value :
>>> GUC : port Value : 6432
>>> GUC : pxf_enable_filter_pushdown Value : on
>>> GUC : pxf_enable_locality_optimizations Value : on
>>> GUC : pxf_enable_stat_collection Value : on
>>> GUC : pxf_remote_service_login Value :
>>> GUC : pxf_remote_service_secret Value :
>>> GUC : pxf_service_address Value : localhost:51200
>>> GUC : pxf_stat_max_fragments Value : 100
>>> GUC : random_page_cost Value : 100
>>> GUC : regex_flavor Value : advanced
>>> GUC : runaway_detector_activation_percent Value : 95
>>> GUC : search_path Value : "$user",public
>>> GUC : seg_max_connections Value : 3000
>>> GUC : segment_directory Value :
>>> GUC : seq_page_cost Value : 1
>>> GUC : server_encoding Value : UTF8
>>> GUC : server_ticket_renew_interval Value : 43200000
>>> GUC : server_version Value : 8.2.15
>>> GUC : server_version_num Value : 80215
>>> GUC : shared_buffers Value : 4000
>>> GUC : shared_preload_libraries Value :
>>> GUC : ssl Value : off
>>> GUC : ssl_ciphers Value : ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH
>>> GUC : standard_conforming_strings Value : off
>>> GUC : standby_address_host Value : localhost
>>> GUC : statement_timeout Value : 0
>>> GUC : superuser_reserved_connections Value : 3
>>> GUC : tcp_keepalives_count Value : 9
>>> GUC : tcp_keepalives_idle Value : 7200
>>> GUC : tcp_keepalives_interval Value : 75
>>> GUC : temp_buffers Value : 1024
>>> GUC : TimeZone Value : PRC
>>> GUC : timezone_abbreviations Value : Default
>>> GUC : track_activities Value : on
>>> GUC : track_counts Value : off
>>> GUC : transaction_isolation Value : read committed
>>> GUC : transaction_read_only Value : off
>>> GUC : transform_null_equals Value : off
>>> GUC : unix_socket_directory Value :
>>> GUC : unix_socket_group Value :
>>> GUC : unix_socket_permissions Value : 511
>>> GUC : update_process_title Value : on
>>> GUC : vacuum_cost_delay Value : 0
>>> GUC : vacuum_cost_limit Value : 200
>>> GUC : vacuum_cost_page_dirty Value : 20
>>> GUC : vacuum_cost_page_miss Value : 10
>>> GUC : vacuum_freeze_min_age Value : 100000000
>>> GUC : work_mem Value : 51200
>>>
>>>
>>>
>>>
>>>
>>> At 2016-09-21 13:00:41, "Vineet Goel" <vvineet@apache.org> wrote:
>>>
>>> Could you please post your SQL DDL statement? How many URLs do you have
>>> in your external table? Also, your HASH dist table - how many buckets are
>>> defined, if any? Are the # of URLs more than the # of buckets or
>>> default_hash_table_bucket_number value? Perhaps you can attach your
>>> hawq-site.xml file as well.
>>>
>>> Also see:
>>> http://hdb.docs.pivotal.io/20/datamgmt/load/g-gpfdist-protocol.html
>>>
>>> Thanks
>>> Vineet
>>>
>>>
>>> On Tue, Sep 20, 2016 at 7:07 PM 来熊 <yin.zhb@163.com> wrote:
>>>
>>>> Hi,all:
>>>>     I am testing hawq 2.0.0 , and I find a problem like this:
>>>>  I load data from an external table (created using "like target_table"
>>>> statement) ,
>>>> if the target table was distributed by some column(s), it raise this
>>>> error:
>>>>  External scan error: There are more external files (URLs) than primary
>>>> segments that can read them (COptTasks.cpp:1756)
>>>> if the target table was distributed randomly, it works well,
>>>> I don't set any parameter special,does anybody know how to resolve this
>>>> problem?
>>>> thanks a lot.
>>>>
>>>


-- 
Thanks

Hubert Zhang

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message