ctakes-dev mailing list archives

From Kim Ebert <kim.eb...@imatsolutions.com>
Subject Re: CTAKES mirroring on github.
Date Thu, 28 May 2015 16:01:31 GMT
Hi Steve,

It may or may not be the issue. You are right, Infra hasn't given any
reason why the repo only goes up to August 2013. My theory is that the
overall repo size is causing memory issues that prevent the mirror from
going beyond August 2013... but it is just a guess. On my local machine,
which has a large amount of RAM, I was able to run git svn fetch
correctly, so it doesn't appear that there is anything corrupt or
problematic with the git svn fetch call itself.

I've had issues before with my own personal git repos consuming all the
available memory on VMs because of large files. Git really doesn't
handle large files well, as it usually tries to put everything into RAM
/ swap space. If the entire repo size exceeds the RAM / swap space, git
will crash... generally making a mess of things.
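
One way to see which files dominate a repo's history is to list its
largest blobs. The following is only a minimal sketch, not something
that was run against the cTAKES repo: the function names
(parse_batch_check, largest_blobs, scan_repo) are hypothetical, and it
assumes git is on the PATH and relies on the standard commands
git rev-list --objects --all and git cat-file --batch-check.

```python
import subprocess

# Format string for `git cat-file --batch-check`; %(rest) echoes back
# whatever followed the object name on the input line (here, the path).
BATCH_FORMAT = "%(objecttype) %(objectname) %(objectsize) %(rest)"


def parse_batch_check(lines):
    """Yield (size_in_bytes, path) for every blob line; skip commits and trees."""
    for line in lines:
        parts = line.rstrip("\n").split(" ", 3)
        if len(parts) < 3 or parts[0] != "blob":
            continue
        path = parts[3] if len(parts) > 3 else ""
        yield int(parts[2]), path


def largest_blobs(lines, top_n=10):
    """Return the top_n (size, path) pairs, largest first."""
    return sorted(parse_batch_check(lines), reverse=True)[:top_n]


def scan_repo(repo_dir, top_n=10):
    """Run git in repo_dir and return its largest blobs (hypothetical helper)."""
    # Every object reachable from any ref, one "<sha> <path>" per line.
    objects = subprocess.run(
        ["git", "rev-list", "--objects", "--all"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    # Ask git for the type and size of each of those objects.
    sizes = subprocess.run(
        ["git", "cat-file", "--batch-check=" + BATCH_FORMAT],
        cwd=repo_dir, input=objects, check=True, capture_output=True, text=True,
    ).stdout
    return largest_blobs(sizes.splitlines(), top_n)
```

Calling scan_repo(".") from inside a clone returns a list of
(size, path) pairs, which should make it easy to confirm whether a few
build archives account for most of the repo size.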

GitHub limiting file size is just an interesting side note.

I'm really interested in making use of the git repo vs the svn repo, so
I'm hoping to get things to move forward here.

IMAT Solutions <http://imatsolutions.com>
Kim Ebert
Software Engineer
Office: 208.971.1509
kim.ebert@imatsolutions.com
On 05/28/2015 09:31 AM, Steven Bethard wrote:
> On Thu, May 14, 2015 at 1:56 PM, Kim Ebert
> <kim.ebert@perfectsearchcorp.com> wrote:
>> I've done some investigation into using / working with the git repo for cTAKES, and
>> I found that it is huge. It doesn't work well with GitHub either, as I keep running into
>> timeouts.
>>
>> I would like to suggest that we remove two cTAKES build files and the ctakes-gui-0.0.1.zip
>> file. This takes the repo from about 8 GB down to 1.8 GB. It is likely that the reason the
>> git mirror is failing is the large size of the repo.
> While I'm all for removing some of the huge files, note that the file
> size is not the problem. GitHub is mirroring everything (except maybe
> the large files), it's just that git://git.apache.org/ctakes.git is
> not complete. It only goes up to August 2013.
>
> Steve
>

