hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Created: (HBASE-809) Deletall doesn't and inserts after delete don't work as expected.
Date Fri, 08 Aug 2008 22:44:44 GMT
Deletall doesn't and inserts after delete don't work as expected.
-----------------------------------------------------------------

                 Key: HBASE-809
                 URL: https://issues.apache.org/jira/browse/HBASE-809
             Project: Hadoop HBase
          Issue Type: Bug
    Affects Versions: 0.2.0
            Reporter: stack
             Fix For: 0.2.1


HBASE-808 describes the program that was run in the snippet below, which is
taken from the mailing list.  The snippet describes unexpected behaviors.

{code}
..

Test 2

A bit more surprising: I delete my row, using the deleteall command in the
shell:


# SHELL 

hbase(main):001:0> scan 'proxy-0.2'
ROW                          COLUMN+CELL
 testrow                     column=bytes:, timestamp=1100, value=valbytes
ts 1100
 testrow                     column=status:, timestamp=1100, value=valstat
ts1100
2 row(s) in 0.3560 seconds
hbase(main):002:0> deleteall 'proxy-0.2', 'testrow'
0 row(s) in 0.1050 seconds
hbase(main):003:0> scan 'proxy-0.2'
ROW                          COLUMN+CELL
0 row(s) in 0.2540 seconds


The table is now empty, and running my dumpRowHistory() method confirms
it. OK. Now I run my Test 1 again, restarting from timestamp 1000:


# OUTPUT

> > Connecting to hbase master...
 > -- Inserted ts 1000
 > Versions or row : testrow
 > -- Inserted ts 1010
 > Versions or row : testrow
 > -- Inserted ts 1020
 > Versions or row : testrow
 > -- Inserted ts 1030
 > Versions or row : testrow
 > -- Inserted ts 1040
 > Versions or row : testrow
 > -- Inserted ts 1050
 > Versions or row : testrow
 > -- Inserted ts 1060
 > Versions or row : testrow
 > -- Inserted ts 1070
 > Versions or row : testrow


It seems that the rows are not inserted. Querying from the shell:


# SHELL 

hbase(main):004:0> scan 'proxy-0.2'
ROW                          COLUMN+CELL
0 row(s) in 0.2030 seconds


But if I allow the program to make more iterations than the first time
(ts > 1100), the newest timestamps are taken into account, as if the table
remembered the previous maximum value of the timestamp.

Relaunching the code of Test 1:


# OUTPUT

> > Connecting to hbase master...
 > -- Inserted ts 1000
 > Versions or row : testrow
 > -- Inserted ts 1010
 > Versions or row : testrow
 > -- Inserted ts 1020
 > Versions or row : testrow
 > -- Inserted ts 1030
 > Versions or row : testrow
 > -- Inserted ts 1040
 > Versions or row : testrow
 > -- Inserted ts 1050
 > Versions or row : testrow
 > -- Inserted ts 1060
 > Versions or row : testrow
 > -- Inserted ts 1070
 > Versions or row : testrow
 > -- Inserted ts 1080
 > Versions or row : testrow
 > -- Inserted ts 1090
 > Versions or row : testrow
 > -- Inserted ts 1100
 > Versions or row : testrow
 > #1 MXTS[1100] bytes: => valbytes ts 1100 [1100]
 > #2 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat
ts1090 [1090]
 > #3 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat
ts1080 [1080]
 > #4 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
ts1070 [1070]
 > #5 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
ts1060 [1060]
 > #6 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
ts1050 [1050]
 > #7 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
ts1040 [1040]
 > #8 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
ts1030 [1030]
 > #9 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
ts1020 [1020]
 > #10 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
ts1010 [1010]
 > #11 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
ts1000 [1000]
 > -- Inserted ts 1110
 > Versions or row : testrow
 > #1 MXTS[1110] bytes: => valbytes ts 1110 [1110]
 > #2 MXTS[1100] bytes: => valbytes ts 1100 [1100], status: => valstat
ts1100 [1100]
 > #3 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat
ts1090 [1090]
 > #4 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat
ts1080 [1080]
 > #5 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
ts1070 [1070]
 > #6 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
ts1060 [1060]
 > #7 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
ts1050 [1050]
 > #8 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
ts1040 [1040]
 > #9 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
ts1030 [1030]
 > #10 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
ts1020 [1020]
 > #11 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
ts1010 [1010]
 > #12 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
ts1000 [1000]
 > -- Inserted ts 1120
 > Versions or row : testrow
 > #1 MXTS[1120] bytes: => valbytes ts 1120 [1120]
 > #2 MXTS[1110] bytes: => valbytes ts 1110 [1110], status: => valstat
ts1110 [1110]
 > #3 MXTS[1100] bytes: => valbytes ts 1100 [1100], status: => valstat
ts1100 [1100]
 > #4 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat
ts1090 [1090]
 > #5 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat
ts1080 [1080]
 > #6 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
ts1070 [1070]
 > #7 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
ts1060 [1060]
 > #8 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
ts1050 [1050]
 > #9 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
ts1040 [1040]
 > #10 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
ts1030 [1030]
 > #11 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
ts1020 [1020]
 > #12 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
ts1010 [1010]
 > #13 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
ts1000 [1000]
 > -- Inserted ts 1130
 > Versions or row : testrow
 > #1 MXTS[1130] bytes: => valbytes ts 1130 [1130]
 > #2 MXTS[1120] bytes: => valbytes ts 1120 [1120], status: => valstat
ts1120 [1120]
 > #3 MXTS[1110] bytes: => valbytes ts 1110 [1110], status: => valstat
ts1110 [1110]
 > #4 MXTS[1100] bytes: => valbytes ts 1100 [1100], status: => valstat
ts1100 [1100]
 > #5 MXTS[1090] bytes: => valbytes ts 1090 [1090], status: => valstat
ts1090 [1090]
 > #6 MXTS[1080] bytes: => valbytes ts 1080 [1080], status: => valstat
ts1080 [1080]
 > #7 MXTS[1070] bytes: => valbytes ts 1070 [1070], status: => valstat
ts1070 [1070]
 > #8 MXTS[1060] bytes: => valbytes ts 1060 [1060], status: => valstat
ts1060 [1060]
 > #9 MXTS[1050] bytes: => valbytes ts 1050 [1050], status: => valstat
ts1050 [1050]
 > #10 MXTS[1040] bytes: => valbytes ts 1040 [1040], status: => valstat
ts1040 [1040]
 > #11 MXTS[1030] bytes: => valbytes ts 1030 [1030], status: => valstat
ts1030 [1030]
 > #12 MXTS[1020] bytes: => valbytes ts 1020 [1020], status: => valstat
ts1020 [1020]
 > #13 MXTS[1010] bytes: => valbytes ts 1010 [1010], status: => valstat
ts1010 [1010]
 > #14 MXTS[1000] bytes: => valbytes ts 1000 [1000], status: => valstat
ts1000 [1000]


Once the timestamp reaches a value newer than the previous maximum, the row
is inserted. Moreover, the previous insertions appear!

Notice another problem: the last insertion is missing one cell, the
'status:' column.

Using the shell to scan the table gives the same result:


# SHELL
hbase(main):003:0> scan 'proxy-0.2'
ROW                          COLUMN+CELL
 testrow                     column=bytes:, timestamp=1130, value=valbytes
ts 1130


Relaunching HBase with the stop-hbase.sh / start-hbase.sh scripts leads to
another unexpected behaviour:

When I run the scan command in the shell, I get the same result as above:


# SHELL
hbase(main):001:0> scan 'proxy-0.2'
ROW                          COLUMN+CELL
 testrow                     column=bytes:, timestamp=1130, value=valbytes
ts 1130


but if I launch the dumpRowHistory method, it appears that most of the
history of the status: column is lost.

Notice that I tried many times and never got the same behaviour twice
here: sometimes the other column is missing, or the row is entirely lost,
giving no result at all.


# OUTPUT

 > #1 MXTS[1130] bytes: => valbytes ts 1130 [1130]
 > #2 MXTS[1120] bytes: => valbytes ts 1120 [1120], status: => valstat
ts1120 [1120]
 > #3 MXTS[1110] bytes: => valbytes ts 1110 [1110]
 > #4 MXTS[1100] bytes: => valbytes ts 1100 [1100]
 > #5 MXTS[1090] bytes: => valbytes ts 1090 [1090]
 > #6 MXTS[1080] bytes: => valbytes ts 1080 [1080]
 > #7 MXTS[1070] bytes: => valbytes ts 1070 [1070]
 > #8 MXTS[1060] bytes: => valbytes ts 1060 [1060]
 > #9 MXTS[1050] bytes: => valbytes ts 1050 [1050]
 > #10 MXTS[1040] bytes: => valbytes ts 1040 [1040]
 > #11 MXTS[1030] bytes: => valbytes ts 1030 [1030]
 > #12 MXTS[1020] bytes: => valbytes ts 1020 [1020]
 > #13 MXTS[1010] bytes: => valbytes ts 1010 [1010]
 > #14 MXTS[1000] bytes: => valbytes ts 1000 [1000]


I tried other tests: replacing only one column, using an existing timestamp
to modify one single value, inserting past values, and so on. My
conclusion is that either I don't understand the general behaviour, or I
am misusing the API.

However, normal insertion and normal queries (I mean without any
timestamp) give me coherent and predictable results, as do normal
insertion and querying with past timestamps.

{code}
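
The report above says the table behaves "as if" it remembered the previous
maximum timestamp. The sketch below is a toy model of that hypothesis only.
It is not HBase code and makes no claim about how HBase 0.2.0 implements
deleteall; the class ToyVersionedRow, its methods, and the deleteMarker
field are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only: NOT HBase code and not a claim about HBase internals. */
class ToyVersionedRow {
    // column name -> (timestamp -> value), timestamps sorted ascending
    private final Map<String, TreeMap<Long, String>> cells = new TreeMap<>();
    // Assumption: cells at or below this timestamp are hidden.
    // Long.MIN_VALUE means no delete has happened yet.
    private long deleteMarker = Long.MIN_VALUE;

    public void put(String column, long timestamp, String value) {
        cells.computeIfAbsent(column, c -> new TreeMap<>()).put(timestamp, value);
    }

    /** Hypothetical delete: remember the newest timestamp stored so far. */
    public void deleteAll() {
        for (TreeMap<Long, String> versions : cells.values()) {
            if (!versions.isEmpty()) {
                deleteMarker = Math.max(deleteMarker, versions.lastKey());
            }
        }
    }

    /** Timestamps of one column that are still visible, newest first. */
    public List<Long> visibleTimestamps(String column) {
        TreeMap<Long, String> versions = cells.getOrDefault(column, new TreeMap<>());
        // tailMap(marker, false) keeps only timestamps strictly above the marker.
        return new ArrayList<>(versions.tailMap(deleteMarker, false).descendingKeySet());
    }

    public static void main(String[] args) {
        ToyVersionedRow row = new ToyVersionedRow();
        row.put("bytes:", 1100L, "valbytes ts 1100");
        row.put("status:", 1100L, "valstat ts1100");
        row.deleteAll();

        // Re-run the insert loop from the report: timestamps 1000..1070.
        for (long ts = 1000L; ts <= 1070L; ts += 10L) {
            row.put("bytes:", ts, "valbytes ts " + ts);
        }
        System.out.println("after ts 1070: " + row.visibleTimestamps("bytes:"));

        // Continue past the old maximum of 1100.
        for (long ts = 1080L; ts <= 1110L; ts += 10L) {
            row.put("bytes:", ts, "valbytes ts " + ts);
        }
        System.out.println("after ts 1110: " + row.visibleTimestamps("bytes:"));
    }
}
```

Under this model, nothing is visible after the inserts at ts 1000 to 1070,
which matches the empty scan in the report. After the insert at ts 1110 the
model shows only ts 1110. The report instead shows ts 1000 to 1100
reappearing as well, so the hypothesis accounts for the hidden inserts but
not for the reappearance or the missing 'status:' cells.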


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

