Date: Thu, 10 Mar 2011 10:21:52 +0100
Subject: Where to store BLOBs in clerezza?
From: Tsuyoshi Ito
To: "clerezza-dev@incubator.apache.org"

Hi,

Currently we store BLOBs in graphs as base64Binary literals by default. I am not sure if this is the way to go.
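To make the cost of this representation concrete, here is a minimal sketch in plain Java. It does not use any Clerezza API: the class and method names are hypothetical, and it assumes `java.util.Base64` (Java 8 or later) is available. It only illustrates the round trip between raw bytes and the lexical form of a base64Binary literal, and the fact that base64 encodes every 3 bytes as 4 characters.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical illustration class, not part of Clerezza.
class BlobLiteralSketch {

    // Encode raw bytes as the lexical form of an xsd:base64Binary literal.
    static String toBase64Literal(byte[] blob) {
        return Base64.getEncoder().encodeToString(blob);
    }

    // Decode the lexical form back to raw bytes, as is needed
    // before the BLOB can be served to a client.
    static byte[] fromBase64Literal(String lexicalForm) {
        return Base64.getDecoder().decode(lexicalForm);
    }

    public static void main(String[] args) {
        byte[] blob = new byte[3_000_000]; // stand-in for a file of about 3 MB
        String literal = toBase64Literal(blob);
        byte[] roundTrip = fromBase64Literal(literal);

        System.out.println("raw bytes:     " + blob.length);
        System.out.println("literal chars: " + literal.length());
        System.out.println("round trip ok: " + Arrays.equals(blob, roundTrip));
    }
}
```

For the 3,000,000-byte input above the literal is 4,000,000 characters long, and both the literal and the decoded byte array are held in memory at the same time during conversion.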
I am wondering what other users and developers think about this. I have the following concerns:

a) Backing up graphs (export as Turtle) and restoring them (PUT of rdf+xml or Turtle) is cumbersome: it takes a long time, consumes a lot of resources, and could also lead to an out-of-memory exception (see Andy Seaborne's thread concerning TDB).

b) Filtering, adding and removing triples that contain BLOBs (large literals) is slow and can lead to an out-of-memory exception.

c) When BLOBs are requested via a web service, the literals have to be converted to byte arrays. (I am NOT sure whether js and css are stored as base64Binary literals in the graph, but most JavaScript libraries are distributed as a single large file and would therefore be a large literal.)

d) Web publishers who develop js and css have to update the graphs in order to update the js and css (this is often done by trial and error for IE compatibility).

Feedback is welcome.

Cheers
Tsuy