incubator-lucy-dev mailing list archives

From "David Balmain" <>
Subject Re: Problematic platforms
Date Wed, 21 Jun 2006 06:44:12 GMT
On 6/21/06, Marvin Humphrey <> wrote:
> Greets,
> There are some portability problems that may not be worth solving.
> On some Crays, ints, longs and pointers are all 8 bytes (the ILP64
> format).  I propose not supporting any machine where we can't
> guarantee that lucy_i8_t is 1 byte and lucy_i32_t is 4 bytes.
> A second esoteric problem is machines that don't use IEEE 754 for
> floats: <>.  I think
> that the norms-encoding routine will break on such machines.  That
> ought to be the only problem, but it's gnarly enough that I think
> we should just decide not to support those boxes.

Sounds fine to me. If someone needs Lucy to work on one of those
boxes, it will just be a simple matter of them supplying us with
float2byte and byte2float methods.
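
To make that concrete, here is a minimal sketch of what such a pair could look like if it were written with plain arithmetic (frexpf/ldexpf) instead of poking at IEEE 754 bits. The names float2byte and byte2float come from the paragraph above; the byte layout (5 exponent bits, 3 mantissa bits, bias of 15, byte 0 reserved for zero) is only an assumption for illustration and is not claimed to match the norms encoding Lucene actually uses.

```c
#include <assert.h>
#include <math.h>

#define NORM_BIAS 15          /* hypothetical exponent bias */
#define NORM_MAX  122880.0f   /* 1.875 * 2^16, the value of byte 255 */

/* Encode a non-negative float into one byte without touching its bits. */
unsigned char float2byte(float f) {
    int e, exp, mant, code;
    float m;

    if (!(f > 0.0f)) return 0;        /* zero, negatives and NaN */
    if (f >= NORM_MAX) return 255;    /* saturate, including +inf */

    m = frexpf(f, &e);                /* f == m * 2^e, m in [0.5, 1) */
    exp = (e - 1) + NORM_BIAS;        /* exponent of the 1.xxx form */
    if (exp < 0) return 1;            /* underflow: smallest positive code */

    /* Top three fraction bits of 1.xxx, truncated (rounds down). */
    mant = (int)((m * 2.0f - 1.0f) * 8.0f);
    assert(mant >= 0 && mant <= 7);

    code = (exp << 3) | mant;
    return (unsigned char)(code == 0 ? 1 : code);
}

/* Decode a byte produced by float2byte. */
float byte2float(unsigned char b) {
    int exp, mant;

    if (b == 0) return 0.0f;
    exp = b >> 3;
    mant = b & 7;
    return ldexpf(1.0f + (float)mant / 8.0f, exp - NORM_BIAS);
}
```

Because only comparisons, multiplication and the standard math library are involved, nothing here depends on how the machine lays out a float in memory, which is the property a non-IEEE box would need.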

> Another wrinkle is large file support.  Machines that don't support
> large files are growing scarcer by the day, but eventually, somebody
> who has one will want to use Lucy.  Index files can get pretty big.
> Is it even possible for a machine to have large file support and not
> provide a 64-bit integer?  The only thing Lucene ever uses 64-bit
> integers for is file pointers.  KinoSearch takes advantage of this in
> a weird way -- it uses doubles wherever Lucene uses Java longs.  I
> did it that way because Perl always provides support for doubles, but
> 64-bit integer support takes a special compile and generally doesn't
> work very well.  The 52-bit mantissa in an IEEE 754 double is more
> than enough for any file pointer.  But when I made that call, I was
> using native Perl filehandles as InStream objects; KinoSearch doesn't
> do that anymore, and I don't think we should go the doubles-as-file-
> pointers route with Lucy (even though it Just Works).
> I'm inclined to require both large file support and 64-bit integers
> for Lucy.  What say?

I'm not sure about large file support. You've looked into it more than
I have, but I do think 64-bit integers are a must.
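
As a footnote to the doubles-as-file-pointers trick quoted above, here is a small sketch of why it works and where it stops working. A double can hold every integer up to 2^DBL_MANT_DIG exactly (2^53 for IEEE 754), so a file position survives the round trip as long as it stays under that limit. All the names below (MAX_EXACT_POS, pos_fits_double, pos_to_double, double_to_pos) are made up for this example and are not taken from KinoSearch or Lucy.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>

#if FLT_RADIX != 2 || DBL_MANT_DIG > 62
#error "this sketch assumes binary doubles with at most 62 mantissa digits"
#endif

/* Every integer in [0, MAX_EXACT_POS] is exactly representable as a
 * double; this is 2^53 when doubles are IEEE 754. */
#define MAX_EXACT_POS ((int64_t)1 << DBL_MANT_DIG)

/* Nonzero if the file position can be stored in a double without loss. */
int pos_fits_double(int64_t pos) {
    return pos >= 0 && pos <= MAX_EXACT_POS;
}

/* Store a file position in a double (caller must stay within range). */
double pos_to_double(int64_t pos) {
    assert(pos_fits_double(pos));
    return (double)pos;
}

/* Recover the file position from a double written by pos_to_double. */
int64_t double_to_pos(double d) {
    return (int64_t)d;
}
```

With IEEE 754 doubles the limit works out to roughly 9 petabytes, so the scheme is safe for any realistic index file, but a real 64-bit integer type avoids having to reason about the limit at all.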

[aside: What I'm doing in Ferret is storing all file pointers as off_t.
As well as read/write_vint methods I have read/write_voff_t. The only
time I use 64-bit integers (i.e. always 64-bit, unlike off_t, which could
be 32-bit) is when I need to write a fixed-size pointer, as in
the fields and term_vectors index files. I've only just implemented
this but it seems to be working.]
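
For anyone who hasn't seen the variable-length scheme, here is a rough sketch of what a voff_t-style writer and reader could look like. It assumes the usual Lucene-style layout (seven payload bits per byte, low-order group first, high bit set when more bytes follow) and works on a plain memory buffer. The names write_voff and read_voff are hypothetical, int64_t stands in for off_t so the example compiles anywhere, and this is not the actual Ferret code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write a non-negative offset as a variable-length integer.
 * buf must have room for up to 9 bytes. Returns the bytes written. */
size_t write_voff(unsigned char *buf, int64_t value) {
    uint64_t v;
    size_t n = 0;

    assert(value >= 0);
    v = (uint64_t)value;
    while (v > 0x7F) {
        buf[n++] = (unsigned char)((v & 0x7F) | 0x80);  /* more to come */
        v >>= 7;
    }
    buf[n++] = (unsigned char)v;                        /* final byte */
    return n;
}

/* Read an offset written by write_voff. Returns the bytes consumed.
 * Input is assumed to be well formed; the shift guard only prevents
 * undefined behaviour on garbage, it does not report an error. */
size_t read_voff(const unsigned char *buf, int64_t *value) {
    uint64_t v = 0;
    size_t n = 0;
    int shift = 0;
    unsigned char b;

    do {
        b = buf[n++];
        v |= (uint64_t)(b & 0x7F) << shift;
        shift += 7;
    } while ((b & 0x80) && shift < 64);

    *value = (int64_t)v;
    return n;
}
```

Small offsets cost a single byte and a 5 GB offset costs five, which is the appeal of the variable-length form; the fixed-width 64-bit form is still needed wherever a reader must seek straight to the Nth entry.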
