falcon-dev mailing list archives

From "Venkat Ranganathan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-2104) Loss of data in GraphDB when upgrading Falcon from 0.9 to 0.10.
Date Sat, 30 Jul 2016 05:22:20 GMT

    [ https://issues.apache.org/jira/browse/FALCON-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400464#comment-15400464 ]

Venkat Ranganathan commented on FALCON-2104:

From the Titan upgrade [instructions|http://s3.thinkaurelius.com/docs/titan/0.5.4/upgrade.html]:

D.5.1. From 0.4.x and previous

D.5.1.1. API Upgrade

Titan 0.5.0 has introduced a number of new features and seen significant changes to the API.
Refer to the documentation for a detailed description of Titan 0.5.0.

Most importantly, Titan 0.5.0 introduces the management system which should be used for schema
creation, index building, and other administrative tasks that affect the entire graph. The
management system is accessed through g.getManagementSystem() which returns a management transaction
that behaves like a normal transaction but provides additional features.

In a management transaction, edge labels and property keys are created with the methods makeEdgeLabel(String)
and makePropertyKey(String) respectively. Instead of multiple methods for specifying the multiplicity
or cardinality, the builders returned by the respective methods now feature a multiplicity
method and a cardinality method, each of which expects an enum argument. The names of the
methods in Titan 0.4.x are virtually identical to the enum constants used in Titan 0.5.0.

Note that schema type definition and index building have been separated in Titan 0.5.0. In
older versions, one would call sortKey to build a vertex-centric index for an edge label and
indexed to build a global graph index for property keys. These methods have disappeared from
the builder, and indexes are built separately using the management system. Refer to the relevant
sections of the documentation to learn more about building indexes.

Note:
It is still possible to define types in an explicit TitanTransaction; however, it is strongly
encouraged to use this method only for those use cases where schema type creation is part
of normal user transactions. In all other cases Titan’s management system should be used.

Titan 0.5.0 introduces vertex labels as first class schema types. Many applications built
on Titan have developed their own vertex labeling / typing system through vertex properties
or some other means. It is highly recommended to transition to the native vertex labels supported
in Titan 0.5.0.

In previous versions of Titan, the entire graph database configuration had to be provided
in a configuration file for each started Titan instance. Starting in 0.5.0, Titan has a central
configuration system which stores all configuration properties that must be coordinated across
instances. These are initialized from the configuration file used to start the first Titan
instance in a new cluster. After that, additional Titan instances only need a minimal configuration
to connect to the cluster. Note that changing global configuration options can no longer
be accomplished through changes to the local configuration files. Such changes must now be
made through Titan’s management system.

Some configuration options have been renamed and new options have been added. See Chapter
12, Configuration Reference, for an up-to-date listing of config options.

FullDouble and FullFloat no longer exist. Use Double and Float instead, which are now serialized
as 8- and 4-byte floating point numbers, respectively. In places where Double or Float was used
in sort keys (i.e. as the data type of the property in a sort key), use Decimal and Precision
instead, respectively, because they have a fixed decimal range.
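
The byte widths mentioned above can be checked with a small sketch. This is plain Python using the standard `struct` module, not Titan's serializer; it only assumes that Float and Double correspond to IEEE 754 single and double precision, so that a Float occupies 4 bytes and a Double 8 bytes.

```python
import struct

# Assumption: Float maps to IEEE 754 single precision and Double to double
# precision. This is not Titan's serializer, only an illustration of widths.
float_bytes = struct.pack(">f", 0.1)   # single precision, big-endian
double_bytes = struct.pack(">d", 0.1)  # double precision, big-endian

print("Float width in bytes:", len(float_bytes))
print("Double width in bytes:", len(double_bytes))

# Reading the values back shows the precision difference between the two.
float_roundtrip = struct.unpack(">f", float_bytes)[0]
double_roundtrip = struct.unpack(">d", double_bytes)[0]

print("Float round trip: ", repr(float_roundtrip))
print("Double round trip:", repr(double_roundtrip))
```

The 4-byte value comes back only approximately equal to 0.1, while the 8-byte value round-trips exactly, which may matter when existing property values are re-imported under the new types.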

D.5.1.2. Data Upgrade

The data storage format of Titan 0.5.0 is incompatible with previous releases. The 0.5.0 release
does not yet include utilities to automatically convert data stored with previous releases.
This is planned for the 0.5.1 release. If a data upgrade is desired before this release, it
is encouraged to attempt an export from the old version using the graphson format and import
it into a new Titan 0.5.0 graph using Faunus/Titan-Hadoop.

Looking through the changes, the API updates do not appear to affect our use case.
BTW, this was found while [~bvellanki] was debugging the QE rolling upgrade tests, which
have not shown any other graphdb issues.

> Loss of data in GraphDB when upgrading Falcon from 0.9 to 0.10. 
> ----------------------------------------------------------------
>                 Key: FALCON-2104
>                 URL: https://issues.apache.org/jira/browse/FALCON-2104
>             Project: Falcon
>          Issue Type: Bug
>    Affects Versions: 0.10
>            Reporter: Balu Vellanki
>            Assignee: Balu Vellanki
>            Priority: Critical
>             Fix For: trunk, 0.10
>   Original Estimate: 48h
>  Remaining Estimate: 48h
> FALCON-1333 (Instance Search feature) requires Falcon to use titan-berkeleyje version
> 0.5.4 to support indexing. Up to and including version 0.9, Falcon used titan-berkeleyje-jre6
> version 0.4.2. A GraphDB created by version 0.4.2 cannot be read by version 0.5.4. When
> attempting an upgrade, I realized that entity and instance lineage data created in version 0.9
> could not be read by Falcon in version 0.10.
> The only solution seems to be to provide a tool that does the following:
> - Use version 0.4.2 of titan-berkeleyje-jre6 to read the BerkeleyDB-based graphDB and
> create a JSON file with all the data.
> - Shut down falcon-0.9 and upgrade Falcon.
> - Use version 0.5.4 of titan-berkeleyje to read the JSON file and repopulate the
> BerkeleyDB-based graphDB.
> - Restart falcon-0.10.
> I will work on the tool and update the release notes and upgrade instructions accordingly.
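
The export and re-import steps proposed in the description could be prototyped roughly as follows. This is a minimal Python sketch under stated assumptions, not the Falcon tool and not Titan API: the real tool would read with titan-berkeleyje-jre6 0.4.2 and write with titan-berkeleyje 0.5.4. The functions `export_graph` and `import_graph`, the sample vertices and edges, and the `_id` / `_outV` / `_inV` / `_label` field names (loosely modeled on GraphSON) are all hypothetical.

```python
import json
import os
import tempfile


def export_graph(vertices, edges, path):
    """Hypothetical step 1: dump all vertices and edges to one JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"vertices": vertices, "edges": edges}, fh, indent=2, sort_keys=True)


def import_graph(path):
    """Hypothetical step 3: read the JSON file and validate it before repopulating."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    vertices, edges = data["vertices"], data["edges"]
    known_ids = {v["_id"] for v in vertices}
    # Refuse to import edges whose endpoints were lost during the export.
    dangling = [e["_id"] for e in edges
                if e["_outV"] not in known_ids or e["_inV"] not in known_ids]
    if dangling:
        raise ValueError("edges reference missing vertices: %s" % dangling)
    return vertices, edges


# Made-up sample data standing in for lineage data written by the old version.
old_vertices = [
    {"_id": 1, "_type": "vertex", "name": "sample-feed"},
    {"_id": 2, "_type": "vertex", "name": "sample-process"},
]
old_edges = [
    {"_id": 10, "_type": "edge", "_outV": 2, "_inV": 1, "_label": "sample-label"},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    dump_path = os.path.join(tmp_dir, "graphdb-export.json")
    export_graph(old_vertices, old_edges, dump_path)
    new_vertices, new_edges = import_graph(dump_path)

# A basic post-upgrade check: nothing was lost or altered in the round trip.
print("vertices preserved:", new_vertices == old_vertices)
print("edges preserved:", new_edges == old_edges)
```

Comparing the data before and after the round trip, as the last two lines do, is one way the upgrade instructions could let operators confirm that no lineage data was dropped.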

This message was sent by Atlassian JIRA
