Re: [pdf-devel] [pdf_fsys_item_p] function failed with an existing file
From: Aleksander Morgado
Subject: Re: [pdf-devel] [pdf_fsys_item_p] function failed with an existing file
Date: Sun, 18 Jan 2009 20:58:45 +0100
Hi Antoine,
>
> when using the pdf_fsys_item_p function with an existing file, the function
> returns PDF_FALSE.
Not sure whether that is because pdf_fsys_item_p itself failed or because
of problems in how the pdf_text_t strings are handled... see below.
>
> path to file has been initialized with
> pdfHostEncoding = pdf_text_get_host_encoding();
> pdf_text_from_unicode("/path/to/test.pdf", 17, pdfHostEncoding, &text);
>
You can't use a host encoding in pdf_text_from_unicode(). Try the
following and let me know if the problem with pdf_fsys_item_p
persists:
pdf_text_from_unicode("/path/to/test.pdf", 17, PDF_TEXT_UTF8, &text);
> a call to pdf_fsys_item_p(NULL, test) returns PDF_FALSE; however,
> /path/to/test.pdf exists
>
> an strace on the program shows that the access() function is called with
> the following parameters:
> access("/path/to/test.pdf \251\4\10q", R_OK) = -1 ENOENT (No such file or
> directory)
>
> It seems that string parameter given to access is incorrect. It looks like
> the string in the pdf_text_t struct is not NULL terminated.
Inside the pdf_text_t, the string is stored as UTF-32HE (either LE or
BE), without a trailing NUL. In any case, that data can't be passed
directly to a system call: it must first be converted to the host
encoding (POSIX systems) or to UTF-16LE (Windows).
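To illustrate why a buffer without a trailing NUL can't go straight into
a system call, here is a minimal sketch in plain C. It is not libgnupdf
code: it assumes you already hold host-encoded bytes plus their length,
and the helper name path_is_readable is made up for this example. It
copies the bytes into a NUL-terminated buffer before calling POSIX
access().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not part of the GNU PDF library.
 * Treats the first len bytes of buf as a host-encoded path and
 * returns 1 if that path is readable, 0 otherwise (including when
 * the temporary buffer cannot be allocated). */
int path_is_readable(const char *buf, size_t len)
{
    char *path;
    int ok;

    assert(buf != NULL);

    path = malloc(len + 1);
    if (path == NULL)
        return 0;

    memcpy(path, buf, len);   /* copy exactly len bytes */
    path[len] = '\0';         /* add the terminator access() relies on */

    ok = (access(path, R_OK) == 0);
    free(path);
    return ok;
}
```

Whatever bytes follow the first len bytes of buf are ignored, so
trailing garbage like the one in the strace output above never reaches
access().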
>
> Also, when getting pdf_text_t string to output the result, like this :
> pdf_text_get_host(&pdfString, &pdfStringSize, inputFile, pdfHostEncoding);
>
> a printf("%d - %s", pdfStringSize, pdfString) gives
> 17 - /path/to/test.pdf \251\4\10q
>
> I have to do pdfString[pdfStringSize] = 0 before calling the printf function
> to have the correct output :
> 17 - /path/to/test.pdf
Yes, that's right. The returned string has no trailing NUL byte, so you
can't pass the output of pdf_text_get_host() straight to printf(). The
same goes for, say, a UTF-32 string: NUL bytes are very common in that
encoding, and printf() would take the first one as the end of the
string. If you want a string that you can printf, try the
pdf_text_get_unicode() function and request UTF-8. If you also pass the
PDF_TEXT_UNICODE_WITH_NUL_SUFFIX option to that function, a trailing NUL
byte is appended to the string, so you can use it in printf(), for
example.
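Both points can be illustrated in plain C, without any libgnupdf calls
(the helper names below are made up for this sketch). A printf-style
function can print a length-delimited buffer if the length is passed as
a precision with %.*s, and a UTF-32 buffer runs into a zero byte almost
immediately.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "<size> - <bytes>" into out from a buffer
 * that has no trailing NUL. The precision given to %.*s makes the
 * conversion stop after at most size bytes. */
int format_sized(char *out, size_t outlen, const char *buf, int size)
{
    assert(out != NULL && buf != NULL && size >= 0);
    return snprintf(out, outlen, "%d - %.*s", size, size, buf);
}

/* Hypothetical helper: number of bytes a plain %s conversion would
 * consume from buf before reaching a zero byte, capped at len. */
size_t bytes_before_nul(const void *buf, size_t len)
{
    const unsigned char *start = buf;
    const unsigned char *nul = memchr(buf, 0, len);

    return nul != NULL ? (size_t)(nul - start) : len;
}
```

Called on the 17 path bytes followed by garbage, format_sized() produces
"17 - /path/to/test.pdf" without writing a terminator into the source
buffer. For the UTF-32LE bytes of "/p" (2f 00 00 00 70 00 00 00),
bytes_before_nul() returns 1, which is all a %s conversion would print.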
Cheers,
-Aleksander