Opened 16 years ago

Closed 13 years ago

Last modified 12 years ago

#7704 closed Bug (fixed)

JS comments put after statements break make-messages.py output

Reported by: Robby Dermody
Owned by: Ned Batchelder
Component: Internationalization
Version: 1.0
Severity: Normal
Keywords: djangojs, make-messages
Cc: robbyd@…, ned@…, ionel.mc@…
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: yes
Easy pickings: yes
UI/UX: no

Description (last modified by Łukasz Rekucki)

To test, make a JS file (say ``myfile.js``) with the following valid JS content:

.. code-block:: js

    var a = 1;
    if(a != 2 && a != 5) //this comment breaks the file
    {
        //this does not
        alert(gettext("foobar"));
    }

Running ``make-messages.py -d djangojs -a`` will then yield the following output for that (in the ``myfile.js.py`` intermediate file it produces):

.. code-block:: js

    var a = 1;
    if(a != 2 && a != 5) //this comment breaks the file
    {
    #this does not
        alert(gettext("foobar"));
    }


As you can see, the comment after the ``if`` statement was not replaced, and since ``xgettext`` is then run in Perl mode, it seems to choke on that input.
The result depends on the exact code: this example causes only the next ``gettext("foobar")`` call to be skipped (later ones in the file are still extracted).
With other code of mine that had a similar line, nothing was generated at all. The failure is silent, and the only way to notice it is to check the gettext output (or lack thereof :).

This is due to the regexp in make-messages: ``pythonize_re = re.compile(r'\n\s*//')``
and then the replacement code: ``src = pythonize_re.sub('\n#', src)``

That assumes that comments always start on their own line. I'm not submitting a patch right now because I'm unsure about the best regexp to use for this that will catch all the valid JS comment cases (or whether that is even something the Django devs want to do). At the very least, if you all choose not to address this in the code, there should be a note in the documentation telling folks to always put JS comments on their own lines.
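The behaviour described above can be reproduced outside of make-messages with a short script. This is only a sketch: it reuses the pattern and replacement quoted in the report, while the ``src`` and ``converted`` names and the inlined sample are illustrative and not taken from make-messages itself.

```python
import re

# Pattern and replacement as quoted in the report above.
pythonize_re = re.compile(r'\n\s*//')

# The sample JS file from the description, inlined as a string.
src = (
    'var a = 1;\n'
    'if(a != 2 && a != 5) //this comment breaks the file\n'
    '{\n'
    '    //this does not\n'
    '    alert(gettext("foobar"));\n'
    '}\n'
)

converted = pythonize_re.sub('\n#', src)
print(converted)
```

Only the comment that starts its own line is rewritten to ``#``; the trailing comment after the ``if`` is preceded by code rather than by a newline and whitespace, so the pattern never matches it.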

As ``make-messages.py`` is now included in django-admin AFAIK, I've categorized it under that.

Attachments (2)

makemessages.diff (5.0 KB) - added by Ned Batchelder 13 years ago.
Adapt makemessages to use JsLexer
jslex.diff (18.9 KB) - added by Ned Batchelder 13 years ago.
Updated patch: deals with unicode escapes in ids, and fixes a doctest.


Change History (23)

comment:1 by Robby Dermody, 16 years ago

Cc: robbyd@… added

comment:2 by anonymous, 16 years ago

I also faced this *bug*.

It also seems that class comments like this one:

/**
 * ***************************** 
 * AddModule main / window
 * @constructor
 * @class MyDesktop.AddModule
 * *****************************
 */ 

break the gettext references that follow. If I remove the "/" in " * AddModule main / window", it works.

comment:3 by Sung-jin Hong, 16 years ago

Triage Stage: Unreviewed → Accepted

Seems to me like a correct and reproducible bug.

comment:4 by anonymous, 15 years ago

Version: SVN → 1.0

comment:5 by anonymous, 15 years ago

This feature is completely broken in its current state.

comment:6 by anonymous, 15 years ago

I just spent quite some time trying to find out why some strings from my JS files weren't translated.
I was aware of the end-of-line comment problem and some others, as I posted the 07/11/08 08:52:41 comment.

Anyway, something else was breaking the process.
I figured out that this line

,this.split = elem.split ? true : false

among others, causes problems.

Anyway, I asked myself: why only "comment out" the JS comments before processing the fake "js.py" file with gettext?
It seems to me that we can "comment out" all the lines that don't have gettext or ngettext in them.

# 'gettext' also matches 'ngettext', so both kinds of call are kept
pythonize_re2 = re.compile('gettext')
src = open(os.path.join(dirpath, file), "r").readlines()
dest = open(os.path.join(dirpath, '%s.py' % file), "wb")
for line in src:
    if not pythonize_re2.search(line):
        dest.write('#%s' % line)
    else:
        dest.write(line)
dest.close()

I went line by line because of my poor regex skills; it is a bit slower.
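For readers who want to try the idea in isolation, here is a minimal, self-contained sketch of the same line-by-line approach. It works on a string instead of files, and the function name ``comment_out_non_gettext_lines`` and the sample input are hypothetical, not part of make-messages.

```python
def comment_out_non_gettext_lines(js_source):
    """Prefix every line that does not mention gettext with '#'.

    The substring test also catches ngettext. The number of lines is
    unchanged, so line numbers reported by xgettext still match the
    original JS file.
    """
    out = []
    for line in js_source.splitlines(keepends=True):
        if 'gettext' in line:
            out.append(line)
        else:
            out.append('#' + line)
    return ''.join(out)


sample = (
    'var a = 1;\n'
    'if(a != 2 && a != 5) //trailing comment\n'
    '{\n'
    '    alert(gettext("foobar"));\n'
    '}\n'
)
result = comment_out_non_gettext_lines(sample)
print(result)
```

A line that contains ``gettext`` is passed through whole, so any other JS syntax on that same line still reaches xgettext unchanged.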
I made a quick

grep -r 'gettext' /js/templates | grep '?'

then I changed some lines

return values.value ? gettext("Oui") : gettext("Non");

to 

return values.value ? 
 gettext("Oui") : gettext("Non");  

I looked at the latest translation of all JS files to check that 1- each string was present (not the case before) and 2- the line number was correct.
All files were parsed correctly.

Not a solution, just a quick hack.

xav

comment:7 by Peter Baumgartner, 13 years ago

Severity: Normal
Type: Bug

comment:8 by Ned Batchelder, 13 years ago

Owner: changed from nobody to Ned Batchelder
Status: new → assigned

comment:9 by Ned Batchelder, 13 years ago

(edits here incorporated into main description)

Last edited 13 years ago by Ned Batchelder

comment:10 by Łukasz Rekucki, 13 years ago

Description: modified (diff)

Fixed formatting in description :)

by Ned Batchelder, 13 years ago

Attachment: makemessages.diff added

Adapt makemessages to use JsLexer

comment:11 by Ned Batchelder, 13 years ago

Cc: ned@… added
Has patch: set

Two patches attached: jslex.diff adds a Javascript lexer, with tests, and makemessages.diff uses the new lexer to process Javascript files. This also fixes #14045, #15495, and #15331.
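To illustrate why tokenizing helps where a single line-oriented regexp does not, here is a deliberately small sketch. It is not the JsLexer from jslex.diff: ``TOKEN_RE`` and ``strip_comments`` are hypothetical names, and the sketch assumes the input contains no regex literals, which a real JavaScript lexer has to handle.

```python
import re

# Alternatives are tried left to right at each position. String literals
# come first, so '//' or '/*' inside a string is never seen as a comment.
TOKEN_RE = re.compile(
    r'''
    (?P<string>"(?:\\.|[^"\\\n])*"|'(?:\\.|[^'\\\n])*')
  | (?P<block>/\*.*?\*/)
  | (?P<line>//[^\n]*)
    ''',
    re.VERBOSE | re.DOTALL,
)


def strip_comments(js):
    """Remove JS comments, keeping string literals and line breaks."""
    def repl(match):
        if match.group('string') is not None:
            return match.group(0)  # strings pass through untouched
        # Replace the comment with a space plus the newlines it spanned,
        # so tokens do not merge and line numbers stay the same.
        return ' ' + '\n' * match.group(0).count('\n')
    return TOKEN_RE.sub(repl, js)


sample = (
    'if(a != 2 && a != 5) //this comment breaks the file\n'
    '{\n'
    '    /* AddModule main / window */\n'
    '    alert(gettext("see http://example.com"));\n'
    '}\n'
)
cleaned = strip_comments(sample)
print(cleaned)
```

Because comments are recognised as tokens rather than by their position on a line, both the trailing ``//`` comment and the block comment containing a ``/`` are removed, while the ``//`` inside the string literal is left alone.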

comment:12 by Jannis Leidel, 13 years ago

Patch needs improvement: set

As mentioned on the developers mailing list, I strongly believe that refactoring the i18n tools to use Babel for message extraction, instead of shipping our own JavaScript lexer, is the preferable way.

Last edited 13 years ago by Jannis Leidel

comment:13 by Jannis Leidel, 13 years ago

Component: Core (Management commands) → Internationalization

comment:14 by Ned Batchelder, 13 years ago

In the absence of someone working to get Babel integrated with Django, rejecting this patch is the perfect being the enemy of the good, no? Can you identify a problem with this patch? There are lots of problems with the existing trunk code.

in reply to:  14 comment:15 by Jannis Leidel, 13 years ago

Replying to nedbatchelder:

Can you identify a problem with this patch?

Yes, we'd introduce a huge chunk of code that would further manifest the xgettext hack. In other words, I'm not convinced that switching the hack from the Perl to C lexer in gettext is the right approach to solve this problem.

comment:16 by Ned Batchelder, 13 years ago

I understand the philosophical concern. I'm wondering if there's any observable incorrect behavior in the code.

comment:17 by Marc Demierre <marc.demierre@…>, 13 years ago

As the current documentation does not even state that there are limitations with the message extraction from JavaScript files, I think that this patch should be used at least until the transition to Babel. The expected behaviour is to extract all messages wrapped in the gettext() function.

The only real solution to the current problem for a developer is not to use the makemessages utility for javascript at all. I think that using the patch would be better than that.

If the patch is not accepted, I propose to update the documentation to clearly state that JavaScript parsing is a hack and that it is not working properly.

by Ned Batchelder, 13 years ago

Attachment: jslex.diff added

Updated patch: deals with unicode escapes in ids, and fixes a doctest.

comment:18 by ionel.mc@…, 13 years ago

Cc: ionel.mc@… added
Easy pickings: unset

comment:19 by anonymous, 13 years ago

Easy pickings: set

comment:20 by Jannis Leidel, 13 years ago

Resolution: fixed
Status: assigned → closed

In [16333]:

Fixed #7704, #14045 and #15495 -- Introduce a lexer for Javascript to fix multiple problems of the translation of Javascript files with xgettext. Many thanks to Ned Batchelder for his contribution of the JsLex library.

comment:21 by Aymeric Augustin, 12 years ago

In [17515]:

Fixed #17451 -- Mentioned the new JavaScript lexer in the release notes. Refs #7704.
