Validating email addresses in Python and JavaScript

On the BOS2 project we have added email address validation to our survey input fields. We initially used Django’s EmailValidator, but it has some limitations: notably, it does not impose a length restriction on the local part of the email address (the part before the ‘@’ symbol), and it requires a dot (.) in the domain part, so example@com is rejected even though it is a valid address.

After much searching I found a better RFC 822 regular expression rewritten in Python, but it didn’t enforce any length restrictions. By that point in the investigation I had picked up enough regular expression knowledge to ‘simply’ add a couple of zero-width positive look-ahead assertions to the front of the expression: one limits the whole address to 256 characters, and the other limits the text before the ‘@’ to 64 characters. The resulting regular expression still doesn’t cope with comments (e.g. person@example(comment).com), but I think we can live with that for now. I’ve also modified the original to remove the nested character sets, which lets us use the same regular expression in JavaScript as well 🙂
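To show what the look-ahead assertions do on their own, here is a minimal Python sketch. The BASE pattern is a deliberately simplified stand-in that I made up for illustration (it is not the real address pattern), and the names LIMITED and UNLIMITED are likewise hypothetical.

```python
import re

# Hypothetical, deliberately simplified stand-in for the full address pattern.
BASE = r'[^@\s]+@[^@\s]+'

# Two zero-width positive look-aheads placed in front of the base pattern:
#   (?=^.{1,256}$)  the whole string is 1 to 256 characters long
#   (?=.{1,64}@)    an '@' appears after at most 64 leading characters
LIMITED = re.compile(r'^(?=^.{1,256}$)(?=.{1,64}@)' + BASE + r'$')
UNLIMITED = re.compile(r'^' + BASE + r'$')

long_local = 'a' * 65 + '@example.com'
print(bool(UNLIMITED.match(long_local)))  # True
print(bool(LIMITED.match(long_local)))    # False
```

Because the look-aheads consume no characters, they can be bolted onto the front of an existing expression without touching the rest of it.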

 

So here is the compiled regular expression:

^(?=^.{1,256}$)(?=.{1,64}@)(?:[^\x00-\x20\x22\x28\x29\x2c\x2e\x3a-\x3c\x3e\x40\x5b-\x5d\x7f-\xff]+|\x22(?:[^\x0d\x22\x5c\x80-\xff]|\x5c[\x00-\x7f])*\x22)(?:\x2e(?:[^\x00-\x20\x22\x28\x29\x2c\x2e\x3a-\x3c\x3e\x40\x5b-\x5d\x7f-\xff]+|\x22(?:[^\x0d\x22\x5c\x80-\xff]|\x5c[\x00-\x7f])*\x22))*\x40(?:[^\x00-\x20\x22\x28\x29\x2c\x2e\x3a-\x3c\x3e\x40\x5b-\x5d\x7f-\xff]+|[\x5b](?:[^\x0d\x5b-\x5d\x80-\xff]|\x5c[\x00-\x7f])*[\x5d])(?:\x2e(?:[^\x00-\x20\x22\x28\x29\x2c\x2e\x3a-\x3c\x3e\x40\x5b-\x5d\x7f-\xff]+|[\x5b](?:[^\x0d\x5b-\x5d\x80-\xff]|\x5c[\x00-\x7f])*[\x5d]))*$
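
As a rough sketch of how the expression above might be used from Python: the pattern below is the same expression, only split across lines for readability. The names EMAIL_RE and is_valid_email are my own placeholders, not part of Django or any library.

```python
import re

# The expression from above, split into its parts (adjacent raw strings are
# concatenated by Python, so this compiles to the same single pattern).
EMAIL_RE = re.compile(
    r'^(?=^.{1,256}$)(?=.{1,64}@)'  # length look-aheads
    # local part: first word (atom or quoted string)
    r'(?:[^\x00-\x20\x22\x28\x29\x2c\x2e\x3a-\x3c\x3e\x40\x5b-\x5d\x7f-\xff]+'
    r'|\x22(?:[^\x0d\x22\x5c\x80-\xff]|\x5c[\x00-\x7f])*\x22)'
    # local part: further dot-separated words
    r'(?:\x2e(?:[^\x00-\x20\x22\x28\x29\x2c\x2e\x3a-\x3c\x3e\x40\x5b-\x5d\x7f-\xff]+'
    r'|\x22(?:[^\x0d\x22\x5c\x80-\xff]|\x5c[\x00-\x7f])*\x22))*'
    r'\x40'  # the '@' sign
    # domain: first sub-domain (atom or [domain literal])
    r'(?:[^\x00-\x20\x22\x28\x29\x2c\x2e\x3a-\x3c\x3e\x40\x5b-\x5d\x7f-\xff]+'
    r'|[\x5b](?:[^\x0d\x5b-\x5d\x80-\xff]|\x5c[\x00-\x7f])*[\x5d])'
    # domain: further dot-separated sub-domains
    r'(?:\x2e(?:[^\x00-\x20\x22\x28\x29\x2c\x2e\x3a-\x3c\x3e\x40\x5b-\x5d\x7f-\xff]+'
    r'|[\x5b](?:[^\x0d\x5b-\x5d\x80-\xff]|\x5c[\x00-\x7f])*[\x5d]))*'
    r'$'
)


def is_valid_email(address):
    """Return True if the address matches the expression (hypothetical helper)."""
    # fullmatch is used because '$' in Python also matches just before a
    # trailing newline, which would otherwise let 'example@com\n' through.
    return EMAIL_RE.fullmatch(address) is not None


print(is_valid_email('example@com'))                  # True
print(is_valid_email('a' * 65 + '@example.com'))      # False
print(is_valid_email('person@example(comment).com'))  # False
```

One thing to be aware of: with Python 3 str input, the \x7f-\xff ranges only exclude code points up to U+00FF, so characters above that range would get through this sketch. Whether that matters depends on what your input fields are expected to accept.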

The source used to generate it might be useful to others, so you can find it here:

http://pastebin.com/X8YexXDK