Opened 7 years ago

Closed 4 years ago

Last modified 3 years ago

#7704 closed Bug (fixed)

JS comments put after statements break make-messages.py output

Reported by: robbyd Owned by: nedbatchelder
Component: Internationalization Version: 1.0
Severity: Normal Keywords: djangojs, make-messages
Cc: robbyd@…, ned@…, ionel.mc@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: yes UI/UX:

Description (last modified by lrekucki)

To test, make a JS file (say myfile.js) with the following valid JS content:

var a = 1;
if(a != 2 && a != 5) //this comment breaks the file
{
    //this does not
    alert(gettext("foobar"));
}

Running make-messages.py -d djangojs -a will then yield the following output for that (in the myfile.js.py intermediate file it produces):

var a = 1;
if(a != 2 && a != 5) //this comment breaks the file
{
#this does not
    alert(gettext("foobar"));
}

As you can see, the comment after the if statement was not replaced, and since xgettext is then run in Perl mode, it seems to choke on that input. The result depends on the exact code: This example will cause only that next gettext("foobar") not to be generated (ones further down in the code will). With other code I had that had a similar line, nothing was generated. The failure is silent and the only way to know is by checking the gettext output (or lack thereof :).

This is due to the regexp in make-messages: pythonize_re = re.compile(r'\n\s*//') and then the replacement code: src = pythonize_re.sub('\n#', src)

That assumes that comments come after newlines. I'm not submitting a patch right now because I'm unsure about the best regexp to use for this that will get all the valid JS comment cases (or if that is even something the django devs want to do). At the very least, if you all choose not to address this in the code, there should be a note in the documentation telling folks to always put JS comments on their own lines.

As make-messages.py is now included in django-admin AFAIK, I've categorized it to that.

Attachments (2)

makemessages.diff (5.0 KB) - added by nedbatchelder 4 years ago.
Adapt makemessages to use JsLexer
jslex.diff (18.9 KB) - added by nedbatchelder 4 years ago.
Updated patch: deals with unicode escapes in ids, and fix a doctest.

Download all attachments as: .zip

Change History (23)

comment:1 Changed 7 years ago by robbyd

  • Cc robbyd@… added
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

comment:2 Changed 7 years ago by anonymous

I also faced this *bug*.

It also seems that class comments like this one :

/**
 * ***************************** 
 * AddModule main / window
 * @constructor
 * @class MyDesktop.AddModule
 * *****************************
 */ 

breaks the following gettext references. If I remove the "/" in " * AddModule main / window" .. it works.

comment:3 Changed 7 years ago by serialx

  • Triage Stage changed from Unreviewed to Accepted

Seems to me like a correct and reproducible bug.

comment:4 Changed 7 years ago by anonymous

  • Version changed from SVN to 1.0

comment:5 Changed 7 years ago by anonymous

This feature is completely broken at the current state.

comment:6 Changed 7 years ago by anonymous

I just quite some time on trying to find out why some strings from my JS files weren't translated.
I was aware of the end of line comment probleme and some other as I post the 07/11/08 08:52:41 comment.

Any way something else was breaking the process.
I figured out that this line

,this.split = elem.split ? true : false

among others causes problem.

Anyway. I asked myself why only "comment out" the JS comments before processing the fake "js.py" file with gettext ?
Seems to me that we can "comment out" all the lines that don't have gettext or ngettext in it.

pythonize_re2 = re.compile('gettext')
src = open(os.path.join(dirpath, file), "r").readlines()
dest = open(os.path.join(dirpath, '%s.py' % file), "wb")
for line in src :
 if not pythonize_re2.search(line):
  dest.write('#%s' % line)
 else:
  dest.write(line)
dest.close()

I went line by line due to my regex bad competences. a bit slower.
I made a quick

grep -r 'gettext' /js/templates | grep '?'

then I change some lines

return values.value ? gettext("Oui") : gettext("Non");

to 

return values.value ? 
 gettext("Oui") : gettext("Non");  

I looked at the latest translation of all JS files to see if 1- it was present (not the case before) or 2- line number was correct.
All files were well parse

Not a solution a quick hack.

xav

comment:7 Changed 4 years ago by baumer1122

  • Severity set to Normal
  • Type set to Bug

comment:8 Changed 4 years ago by nedbatchelder

  • Owner changed from nobody to nedbatchelder
  • Status changed from new to assigned

comment:9 Changed 4 years ago by nedbatchelder

(edits here incorporated into main description)

Last edited 4 years ago by nedbatchelder (previous) (diff)

comment:10 Changed 4 years ago by lrekucki

  • Description modified (diff)

Fixed formating in description :)

Changed 4 years ago by nedbatchelder

Adapt makemessages to use JsLexer

comment:11 Changed 4 years ago by nedbatchelder

  • Cc ned@… added
  • Has patch set

Two patches attached: jslex.diff adds a Javascript lexer, with tests, and makemessages.diff uses the new lexer to process Javascript files. This also fixes #14045, #15495, and #15331.

comment:12 Changed 4 years ago by jezdez

  • Patch needs improvement set

As mentioned on the developers mailing list, I strongly believe that refactoring the i18n tools to use Babel for message extraction instead of shipping an own JavaScript lexer is the favorable way.

Last edited 4 years ago by jezdez (previous) (diff)

comment:13 Changed 4 years ago by jezdez

  • Component changed from Core (Management commands) to Internationalization

comment:14 follow-up: Changed 4 years ago by nedbatchelder

In the absence of someone working to get Babel integrated with Django, rejecting this patch is the perfect being the enemy of the good, no? Can you identify a problem with this patch? There are lots of problems with the existing trunk code.

comment:15 in reply to: ↑ 14 Changed 4 years ago by jezdez

Replying to nedbatchelder:

Can you identify a problem with this patch?

Yes, we'd introduce a huge chunk of code that would further manifest the xgettext hack. In other words, I'm not convinced that switching the hack from the Perl to C lexer in gettext is the right approach to solve this problem.

comment:16 Changed 4 years ago by nedbatchelder

I understand the philosophical concern. I'm wondering if there's any observable incorrect behavior in the code.

comment:17 Changed 4 years ago by Marc Demierre <marc.demierre@…>

As the actual documentation does not even state that there are limitations with the message extraction from javascript files, I think that this patch should be used at least until the transition to Babel. The expected behaviour is to extract all messages wrapped in the gettext() function.

The only real solution to the current problem for a developer is not to use the makemessages utility for javascript at all. I think that using the patch would be better than that.

If the patch is not accepted, I propose to update the documentation to clearly state that javacript parsing is a hack and that it is not working properly.

Changed 4 years ago by nedbatchelder

Updated patch: deals with unicode escapes in ids, and fix a doctest.

comment:18 Changed 4 years ago by ionel.mc@…

  • Cc ionel.mc@… added
  • Easy pickings unset

comment:19 Changed 4 years ago by anonymous

  • Easy pickings set

comment:20 Changed 4 years ago by jezdez

  • Resolution set to fixed
  • Status changed from assigned to closed

In [16333]:

Fixed #7704, #14045 and #15495 -- Introduce a lexer for Javascript to fix multiple problems of the translation of Javascript files with xgettext. Many thanks to Ned Batchelder for his contribution of the JsLex library.

comment:14 follow-up: Changed 3 years ago by aaugustin

In [17515]:

Fixed #17451 -- Mentioned the new JavaScript lexer in the release notes. Refs #7704.

Note: See TracTickets for help on using tickets.
Back to Top