ctakes-dev mailing list archives

From Kim Ebert <kim.eb...@perfectsearchcorp.com>
Subject Re: CTAKES mirroring on github.
Date Thu, 14 May 2015 17:56:04 GMT
I've done some investigation into using / working with the git repo for
cTAKES, and I found that it is huge. It doesn't work well with GitHub
either, as I keep running into timeouts.

I would like to suggest that we remove two cTAKES build files
and the ctakes-gui-0.0.1.zip file. This takes the repo from about 8 GB
down to 1.8 GB. The large size of the repo is likely the reason the git
mirror is failing. GitHub will also filter out some of these very large
files, as GitHub's max file size is 100MB.

git filter-branch --tree-filter 'rm -rf ctakes-gui-0.0.1.zip' \
    origin/cTAKES-GUI-0.0.1
git filter-branch -f --tree-filter 'rm -rf _cTAKES_build_/cTAKES-2.5*.zip' \
    origin/maven-sandbox
git filter-branch -f --tree-filter 'rm -rf _cTAKES_build_/cTAKES-2.5*.zip' \
    origin/SHARPn-cTAKES

# Clean out unreferenced objects from repo
git -c gc.reflogExpire=0 -c gc.reflogExpireUnreachable=0 \
    -c gc.rerereresolved=0 -c gc.rerereunresolved=0 \
    -c gc.pruneExpire=now gc
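
To check the effect of the cleanup, something like the following could be run before and after the gc step. This is only a sketch: `repo_size_kib` is a hypothetical helper name, and it reports only the packed size that `git count-objects -v` prints, not loose objects.

```shell
# Hypothetical helper: print the packed size of the current repo in KiB.
repo_size_kib() {
    # "git count-objects -v" prints a line of the form "size-pack: <KiB>"
    git count-objects -v | awk '/^size-pack:/ {print $2}'
}
```

Calling `repo_size_kib` inside the clone before the filter-branch runs and again after the gc gives two numbers to compare.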


It may also be helpful to remove
ctakes-dependency-parser-res/src/main/resources/org/apache/ctakes/dependency/parser/models/clearparser_models.jar
from the git repo as well. (238,248,287 bytes)
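
One possible way to look for other oversized files in the history is sketched below. `list_large_blobs` is a hypothetical helper name, and the default threshold simply reuses the 100MB figure mentioned above as an assumption; it is not how the sizes in this message were obtained.

```shell
# Hypothetical helper: list blobs anywhere in history at or above a size threshold.
# Usage: list_large_blobs [min_bytes]   (default 100000000, an assumed 100MB limit)
list_large_blobs() {
    min_bytes="${1:-100000000}"
    # rev-list emits "<sha> <path>"; cat-file looks up type and size for each sha
    # and echoes the path back through %(rest).
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk -v min="$min_bytes" '$1 == "blob" && $2 >= min {
            size = $2
            sub(/^[^ ]+ [^ ]+ /, "")   # drop type and size, keep the full path
            print size, $0
        }' |
        sort -rn
}
```

Each output line is a size in bytes followed by the path, largest first. A blob that appears under several paths is listed once, under the first path rev-list reports.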

Thoughts?

IMAT Solutions <http://imatsolutions.com>
Kim Ebert
Software Engineer
Office: 208.971.1509
kim.ebert@imatsolutions.com
On 05/06/2015 01:17 PM, Steven Bethard wrote:
> Yes, I ping this issue every couple months, but no luck so far. (They
> take a look each time I ask, but haven't yet pushed a working git
> mirror for us.)
>
> Steve
>
> On Tue, May 5, 2015 at 12:09 PM, Kim Ebert
> <kim.ebert@perfectsearchcorp.com> wrote:
>> Ah, looks like the issue is still being looked into.
>>
>> https://issues.apache.org/jira/browse/INFRA-8553
>>
>> On Mon, May 4, 2015 at 4:54 PM, jay vyas <jayunit100.apache@gmail.com>
>> wrote:
>>
>>> Thanks kim.
>>>
>>> Can you file an infra issue ?
>>>
>>> they will look into it.
>>>
>>> I filed one originally
>>> On May 4, 2015 6:32 PM, "Kim Ebert" <kim.ebert@perfectsearchcorp.com>
>>> wrote:
>>>
>>>> It looks like the github hasn't been updated in a while. Any reason?
>>>>
>>>> Thanks,
>>>>
>>>> Kim
>>>>
>>>> On Tue, Feb 17, 2015 at 10:36 AM, Finan, Sean <
>>>> Sean.Finan@childrens.harvard.edu> wrote:
>>>>
>>>>> Our request is for a read-only mirror.  However, if it ever becomes i/o, I
>>>>> don't know if this will have what you want, but http://git.apache.org/
>>>>> Links to documentation (mostly server setup)
>>>>> http://www.apache.org/dev/git.html and a wiki (check toward middle and
>>>>> bottom for committer info) https://wiki.apache.org/general/GitAtApache
>>>>>
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Miller, Timothy [mailto:Timothy.Miller@childrens.harvard.edu]
>>>>> Sent: Tuesday, February 17, 2015 12:31 PM
>>>>> To: dev@ctakes.apache.org
>>>>> Subject: Re: CTAKES mirroring on github.
>>>>>
>>>>> Is there any existing resource to help people who want to use git
>>>>> understand the right workflow to contribute to ctakes? (i.e. how this
>>>>> interacts with svn repos).
>>>>> Tim
>>>>>
>>>>>
>>>>> On 02/17/2015 12:23 PM, jay vyas wrote:
>>>>>> Hi CTakes.  Looks like infra finally got onto the JIRA i made for
>>>>>> this a while back.  They are currently working on fixing a couple of
>>>>>> minor glitches w/ the mirroring (not showing all commits)... but there
>>>>>> now is a mirror for CTakes on github.
>>>>>>
>>>>>>
>>>>>>
>>>>>> https://github.com/apache/ctakes
>>>>>>
>>>>>

