pdfbox-users mailing list archives

From "A.M. Sabuncu" <amsabu...@gmail.com>
Subject Re: Question about PDFTextStripper.java code
Date Mon, 29 Dec 2014 13:34:29 GMT
Thank you Tilman, I will give them a look.  In my case, it is switching
from a C# mindset to Java that is causing me some frustration up front.

On Mon, Dec 29, 2014 at 3:31 PM, Tilman Hausherr <THausherr@t-online.de>
wrote:

> If you're really just starting with java, then it might be useful to read
> a tutorial or watch a youtube video. Youtube has more than just cute cat
> videos!
>
> https://www.youtube.com/results?search_query=java+beginner+tutorial
>
> Tilman
>
> On 29.12.2014 at 14:25, A.M. Sabuncu wrote:
>
>> Thanks Gilad.  I am in fact setting up Eclipse to attach the source for
>> the library.  Another learning curve though! :)
>>
>> On Mon, Dec 29, 2014 at 3:22 PM, Gilad Denneboom
>> <gilad.denneboom@gmail.com> wrote:
>>
>>> Yes, that's correct. You can see it by debugging the code and looking at
>>> the memory references for each variable. They should be the same.
>>>
>>> On Mon, Dec 29, 2014 at 2:08 PM, A.M. Sabuncu <amsabuncu@gmail.com>
>>> wrote:
>>>
>>>> Gilad, thank you so much.  I am new to Java and have been researching
>>>> pass-by-ref/value topics in Java for the last hour!  Essentially, the
>>>> external variable output is a pointer to the same object as the object
>>>> outputStream, is that correct?  Thanks again.
>>>>
>>>> On Mon, Dec 29, 2014 at 2:57 PM, Gilad Denneboom <
>>>> gilad.denneboom@gmail.com>
>>>> wrote:
>>>>
>>>>> All of these variables are references to the same object, so when the
>>>>> contents of the object are edited inside the writeText function, the
>>>>> value pointed at by the external variable (*outputStream* in getText)
>>>>> is changed as well.
>>>>> In other words, when *outputStream* is assigned to *output* (inside
>>>>> writeText), all it says is for that variable to point to the same
>>>>> object. It does not create a copy of the object under a new name.
>>>>>
>>>>> On Mon, Dec 29, 2014 at 1:43 PM, A.M. Sabuncu <amsabuncu@gmail.com>
>>>>> wrote:
>>>>
>>>>>> I am reading the PDFTextStripper.java code and I am stuck trying to
>>>>>> understand a mechanism used within the code.
>>>>>>
>>>>>> Following is the getText() method:
>>>>>>
>>>>>>      public String getText( PDDocument doc ) throws IOException
>>>>>>      {
>>>>>>          StringWriter outputStream = new StringWriter();
>>>>>>          writeText( doc, outputStream );
>>>>>>          return outputStream.toString();
>>>>>>      }
>>>>>>
>>>>>> As you can see, getText() calls writeText() with an outputStream.  In
>>>>>> writeText(), the global variable "Writer output" is set to
>>>>>> outputStream:
>>>>>>
>>>>>>      output = outputStream;
>>>>>>
>>>>>> But there is no code that sets outputStream back to output.
>>>>>> Nevertheless, outputStream.toString() (in getText) returns the
>>>>>> extracted text.
>>>>>>
>>>>>> I know I am missing something here, and any help will be appreciated.
>>>>>> If you think I should post this to the developers' list, please let me
>>>>>> know.
>>>>>> Thanks so much.
>>>>>>
>>>>>> PS: I am using the latest version of PDFBox 1.8.8.
>>>>>>
>>>>>>
>
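
Editor's sketch of the aliasing discussed in this thread. This is not the PDFBox source: the class name AliasDemo and the literal "extracted text" are made up for illustration, and only the getText/writeText/output naming mirrors the snippet quoted above. It shows that assigning outputStream to the field copies the reference, so both names refer to one StringWriter.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical stand-in for the pattern in the thread, not PDFBox code.
class AliasDemo {
    // Plays the role of the "Writer output" field mentioned in the question.
    private Writer output;

    void writeText(Writer outputStream) throws IOException {
        output = outputStream;          // copies the reference, not the object
        output.write("extracted text"); // writes into the caller's StringWriter
    }

    String getText() throws IOException {
        StringWriter outputStream = new StringWriter();
        writeText(outputStream);
        // No "copy back" is needed: output and outputStream are the same object.
        return outputStream.toString();
    }

    public static void main(String[] args) throws IOException {
        AliasDemo demo = new AliasDemo();
        StringWriter outputStream = new StringWriter();
        demo.writeText(outputStream);
        System.out.println(demo.output == outputStream); // true: same object
        System.out.println(demo.getText());              // extracted text
    }
}
```

The == comparison on references checks object identity, which is the programmatic equivalent of comparing the two variables in a debugger.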

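
Editor's sketch on the pass-by-reference versus pass-by-value question raised in the thread. Java passes object references by value: a method can change the object the caller sees, but reassigning the parameter has no effect on the caller's variable. For readers coming from C#, this should correspond to passing a reference type without the ref keyword. The class and method names below are hypothetical.

```java
import java.io.StringWriter;

// Hypothetical demo class; names are illustrative only.
class PassByValueDemo {
    // Mutates the object the caller's variable refers to.
    static void mutate(StringWriter w) {
        w.write("changed");
    }

    // Rebinds only the local parameter; the caller's variable is untouched.
    static void reassign(StringWriter w) {
        w = new StringWriter();
        w.write("lost");
    }

    public static void main(String[] args) {
        StringWriter a = new StringWriter();
        mutate(a);
        reassign(a);
        System.out.println(a); // prints "changed"
    }
}
```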