Re: [Texmacs-dev] working around arXiv' overshoot

Date: Sat, 20 Nov 2021 23:27:59 +0100
Did you use your institute email address to report the issue to arXiv,
including the submission number? They say that whenever we encounter an
issue, we should give them the submission number. I will try writing
to them myself, but unfortunately I don't have any recent paper to
upload, so it is unclear whether it would lead to anything.


On 11/20/21, Basile Audoly wrote:
I tried writing to arXiv's support recently as well; see below.
To me, there is no point in insisting: it is clear that they won't do anything
to help us.

>Where they say that they accept PDF, which is clearly not true, since they do
>not accept our PDF files.
>They explicitly say:
>"Our goal is to store articles in formats that are highly portable and stable over
>time. Currently, the best choice is TeX/LaTeX."
Thank you, Massimiliano.  After reviewing their strategy, I do not see how arXiv
is actually interested in Free Software.  But
actual progress has been happening at Rice University, at MIT, Poland and
>I invite you to ponder the fact that if nobody tries to challenge the status quo, the
>current best choice will be the choice for the next 50 years, irrespective of its
>technical merits. I do not see how this is what free software is about. It should be
>more about allowing people to use the best public technologies to express themselves
>and leaving to the technology the burden of uninteresting tasks.
>Personally, I would be satisfied if a subset of TeX/LaTeX could be agreed upon which
>allows it to be used as a document exchange/storage format. (Think about the passage
>from PS to PDF.) I think this is a more worthy endeavour than criticising our, in my
>opinion, fair right to request that arXiv remove a bug in their PDF-analysis program.
Although not taking the source-specific criticism is unfortunate, I can see the
basis of your request and consider it valid.
I will look into this as a worthy endeavour.
>Best regards,
>Massimiliano Gubinelli
>PS: recently with some colleagues we submitted a paper to a (good) mathematical journal, written in
>LaTeX (i.e. no TeXmacs since my coauthors do not use it). The proofs which came back from the editor
>were a complete mess; it required a 10-page letter to indicate all the corrections to be made and a
>couple of weeks of back-and-forth mail exchange. So you see, when you look at the details, things are
>not as nice as they seem. Not to mention that unless you install on your machine ~3GB of useless
>packages you are not really sure to be able to reproduce a given output from your "highly portable
>and stable" source code.
I understand your criticism about the useless packages.  Basic LaTeX should be
expanded if need be, and a few specific utilities accepted.
But not more than that.  Allowing one to use any package would certainly limit 
the validity of my arguments. The problem of making documents
still prevails after more than forty years.

    >No, it would definitely be easier (and more logical) to have arXiv correct
    >the bug that erroneously tags PDF files as being produced by LaTeX once and for all
    >than to require every single TeXmacs submission to be exported to LaTeX.
    >Currently, arXiv is accepting PDFs produced by MS Word but blocking those
    >produced by TeXmacs.
    If that is so, I would harshly criticise their modus operandi.
    >Your point about the TeXmacs format sounds unfair to me. TeXmacs is far more
    >structured than LaTeX, which does not follow a well-defined grammar. Parsing
    >metadata from TeXmacs source is trivial.
    Beyond the technical, there are other considerations.  Whilst it could be
    well structured according to some software engineering metric, its
    syntactic format is no more accessible to code modification than TeX or
    LaTeX, or any other programming language.  I might understand the
    production of an internal format that fits in with a more structured
    format for what you want to do.  But in doing so, you have removed an
    important aspect.  Whilst it could be acceptable with schools and
    colleges, I do not see how to avoid the observation I have made.
    Myself, I would reject outputs from a GUI that I would not be able to
    modify myself using a basic editor.  Whilst I understand the work people
    have put into it, introducing a syntactic format for users to change
    directly would allow others to understand the document without superficial
    I am not convinced that my evaluation has been unfair.  arXiv is certainly
    being unfair in accepting outputs from MS.  Does arXiv accept documents
    and source done with Texinfo commands?
    >Best wishes,

        Would it not be easier if you could export a LaTeX version and send
        that to arXiv?  TeXmacs has its own typesetting engine, but its format,
        as with XML, is almost impossible to work with from source.  That is
        the fundamental criticism of TeXmacs.  Its design is not germane to
        use from source.
        Dear TeXmacs users,
        arXiv is blocking submissions from TeXmacs. I have submitted the
        following bug report to them. I would encourage you to send a similar
        email if you encounter difficulties. Maybe they will take it seriously if many of us
        Best wishes,
        I am writing my papers using GNU TeXmacs (www.texmacs.org 
<http://www.texmacs.org/>), a wonderful, free, multi-platform editor for 
scientific documents. Its name might be misleading but TeXmacs is *not* related to 
TeX/LaTeX: it has its own typesetting engine.
        When I submit to arXiv a PDF file produced with TeXmacs, it is
        incorrectly recognized as being produced by LaTeX, and the submission
        process is blocked until I provide the LaTeX sources, which do not
        exist. This makes it effectively impossible to submit TeXmacs documents
        to arXiv. This is very unfortunate, as TeXmacs is great software,
        arguably superior to LaTeX in several respects.
        I am attaching a sample TeXmacs document. I am hoping that you can
        either correct your LaTeX detection algorithm so that similar documents
        do not get erroneously blocked, or let me know exactly which properties
        of the PDF document are used to tag it as being produced by LaTeX, so
        that I can help the TeXmacs developers identify a work-around.
        TeXmacs is great free software that needs encouragement from our
        community, not additional barriers.
        Best wishes,
        (your name here)

Joris told me that he already did it, with no effect. But some time passed, so 
maybe one can try again. If different people do the same report, it may 
increase the chance of a response, so a good strategy could be that *all* 
interested people submit an issue report.


On 20.11.21 11:31, Frank wrote:

According to
we should report the issue to help@arxiv.org


I could post an issue in arXiv's bug tracker
(https://github.com/arXiv/arxiv-base/issues), but I would like to hear
from you before I do that. Do you think it is better to classify it
under "bug" or something else?


On 19.11.2021 at 16:29, TeXmacs wrote:
Dear Giovanni,

If someone could get a patch accepted that solves this problem at ArXiv,
then that would be very nice.  But I personally have no time for this;
I just changed 'TeXmacs' to 'T e X m a c s' for the PDF creator;
this is a bit ugly, but it will do for now.

Best wishes, --Joris

On Fri, Nov 19, 2021 at 02:48:15PM +0000, Giovanni Piredda wrote:
Maybe the examination of the code in https://github.com/arXiv could
also help.

Perhaps they accept pull requests.


On 19.11.2021 at 11:49, TeXmacs wrote:
Dear Basile,

Thanks for this extremely useful feedback.

It is still annoying to not use our real name for the creator/producer.
If I make it 'T e X m a c s', then does that work?

Best wishes, --Joris

On Fri, Nov 19, 2021 at 07:49:47AM +0100, Basile Audoly wrote:
Dear Joris,

I did a few more tests. The bottom line is that arXiv rejects any PDF document that matches
the string "*tex*" in the /Creator or /Producer fields, and the match is

Changing fonts actually has no effect (contrary to what I was writing before).

To be specific, changing both /Creator and /Producer to 'T.Xmacs' works, but
changing them to 'Texmacs' or 'GNU Texmacs' fails.

TeX Gyre fonts (Bonum) work just as well, even though "TeX" appears in the
/BaseFont field: this field seems to be ignored (which makes sense).
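
The results reported above are consistent with a plain substring test on
the two metadata fields. As a rough sketch only: arXiv's actual check is
not shown anywhere in this thread, the function name looks_like_tex is
made up for illustration, and the case-insensitive comparison is an
assumption inferred from the fact that 'Texmacs' is rejected.

```python
def looks_like_tex(creator: str, producer: str) -> bool:
    """Hypothetical model of the check: flag the PDF if either metadata
    field contains "tex", ignoring case (an assumption inferred from the
    tests reported in this thread, not arXiv's real code)."""
    return any("tex" in field.lower() for field in (creator, producer))


# Names discussed in the thread, applied to both /Creator and /Producer.
# 'T e X m a c s' is included for comparison; the thread does not report
# an actual arXiv test result for it.
for name in ("TeXmacs", "Texmacs", "GNU Texmacs", "T.Xmacs", "T e X m a c s"):
    verdict = "blocked" if looks_like_tex(name, name) else "accepted"
    print(f"{name!r}: {verdict}")
```

Under this model the font names play no role, which matches the
observation that the /BaseFont field seems to be ignored.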


Hi Basile,

This is a very interesting hack.
It would be nice to have some volunteers to investigate this further.

Question 1: if you replace 'TeXmacs' by 'Texmacs' or 'GNU Texmacs' instead of 
'T.Xmacs', does it still work?

Question 2: could someone please try with any of the TeX Gyre fonts (which also 
have TeX in their name)?

Question 3: if you replace 'Computer Modern' with 'Knuth's Modern Font' and (if
necessary) occurrences of CMR, CMMI, CMEX, etc. with something else (just on
lines with /FontName or /BaseFont), then does this also do the trick?

Best wishes, --Joris

On Sat, Jul 10, 2021 at 06:52:11PM +0200, Basile Audoly wrote:
Hi everyone,

arXiv incorrectly identifies PDF documents produced by TeXmacs as if they had 
been produced by LaTeX and blocks them, see my previous email.

Here is a dirty but simple hack to work around this:
* change the document font from Computer Modern (a.k.a. Roman) to Optima—any 
other font except CM will probably work as well
* export to PDF
* open the PDF document in emacs, search for the string "TeXmacs" and replace the two 
occurrences corresponding to the PDF Creator and Producer with "T.Xmacs"
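
The last step can also be scripted instead of done by hand in emacs. The
following is a minimal sketch, not a tested submission tool. The names
mask_creator_producer and mask_file and the file names in the usage note
are hypothetical. It assumes that the /Creator and /Producer entries are
stored as plain, uncompressed text, each on a line of its own, which the
manual edit described above suggests. The replacement string has the same
length as the original, so byte offsets inside the PDF do not change.

```python
from pathlib import Path


def mask_creator_producer(data: bytes) -> bytes:
    """Replace b"TeXmacs" with b"T.Xmacs" on lines that mention
    /Creator or /Producer, leaving every other byte untouched."""
    out = []
    for line in data.split(b"\n"):
        if b"/Creator" in line or b"/Producer" in line:
            # Same-length replacement keeps the PDF's byte offsets valid.
            line = line.replace(b"TeXmacs", b"T.Xmacs")
        out.append(line)
    return b"\n".join(out)


def mask_file(src: str, dst: str) -> None:
    """Read src, patch the metadata strings, and write the result to dst."""
    Path(dst).write_bytes(mask_creator_producer(Path(src).read_bytes()))
```

For example, mask_file("paper.pdf", "paper-arxiv.pdf") would write a
patched copy and leave the original export untouched. If the metadata
shares a line with other entries (a title, say), those would be altered
as well, so the output should be checked before uploading.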

Of course, this prevents arXiv from being aware of the full content of the 
document. Exporting to LaTeX is still a viable (but more complicated) 


