#5887 IDNA domains does not work under py3
Closed: fixed 6 years ago Opened 7 years ago by mbasti.

Python3 (Fail):

ipa dnszone-add človek.test

ipa: DEBUG: raw: dnszone_add('\udcc4\udc8dlovek.test', idnssoarname=<DNS name hostmaster>, idnssoaserial=1462978450, idnssoarefresh=3600, idnssoaretry=900, idnssoaexpire=1209600, idnssoaminimum=3600, skip_overlap_check=False, force=False, skip_nameserver_check=False, all=False, raw=False, version='2.167')

ipa: ERROR: invalid 'name': invalid domain name

Python2 (OK):

ipa dnszone-add človek.test

ipa: DEBUG: raw: dnszone_add(u'\u010dlovek.test', idnssoarname=<DNS name hostmaster>, idnssoaserial=1462978514, idnssoarefresh=3600, idnssoaretry=900, idnssoaexpire=1209600, idnssoaminimum=3600, skip_overlap_check=False, force=False, skip_nameserver_check=False, all=False, raw=False, version=u'2.167')

Zone added

The issue was caused by the LC_ALL locale being set without any value.

Setting LC_ALL="en_US.UTF-8" resolves the issue.

Closing the ticket as "works for me": without a properly set locale, IPA cannot work with non-ASCII characters.

Reopening ticket, py2 and py3 should behave in the same way.

This ticket is out of scope of 4.4.0 release. Moving to 4.4.1. Note that 4.4.1 needs to be triaged, therefore not everything will be implemented.

Moving to the next major version. Fixing this bug is not critical in a stabilization release.

Mass-moving python3 tickets to FreeIPA 4.6, which should be a smaller release targeted mainly at python3 porting.

Metadata Update from @mbasti:
- Issue assigned to mbasti
- Issue set to the milestone: FreeIPA 4.6

7 years ago

Metadata Update from @mbasti:
- Issue close_status updated to: None
- Issue tagged with: py3

6 years ago

Metadata Update from @mbasti:
- Issue assigned to stlaz (was: mbasti)

6 years ago

Metadata Update from @tkrizek:
- Issue set to the milestone: FreeIPA 4.6.1 (was: FreeIPA 4.6)

6 years ago

Metadata Update from @tkrizek:
- Issue set to the milestone: FreeIPA 4.6.2 (was: FreeIPA 4.6.1)

6 years ago

It's not a problem with an empty LC_ALL but with LC_ALL=C

empty LC_ALL

# LC_ALL="" ipa -dd dnszone-add človek.test
...
ipa: DEBUG: raw: dnszone_add('človek.test', version='2.229')
ipa: DEBUG: dnszone_add('človek.test', version='2.229')
...
# ipa dnszone-find
...
  Zone name: človek.test.
  Active zone: TRUE
...

LC_ALL=C

# LC_ALL="C" ipa -dd dnszone-add človek.test
...
ipa: DEBUG: raw: dnszone_add('\udcc4\udc8dlovek2.test', version='2.229')
ipa: DEBUG: dnszone_add('\udcc4\udc8dlovek2.test', version='2.229')
...
ipa: ERROR: invalid 'name': DNS label cannot be longer than 63 characters

Other commands

It affects all command-line operations, not just DNS:

LC_ALL="C" ipa -dd user-add test --first Töst --last Üßer
...
ipa: DEBUG: raw: user_add('t\udcc3\udcb6st', givenname='T\udcc3\udcb6st', sn='\udcc3\udc9c\udcc3\udc9fer', version='2.229')
ipa: DEBUG: user_add('t\udcc3\udcb6st', givenname='T\udcc3\udcb6st', sn='\udcc3\udc9c\udcc3\udc9fer', version='2.229')
...
ipa: ERROR: an internal error has occurred
LC_ALL="" ipa -dd user-add test --first Töst --last Üßer
...
ipa: DEBUG: raw: user_add('test', givenname='Töst', sn='Üßer', version='2.229')
ipa: DEBUG: user_add('test', givenname='Töst', sn='Üßer', version='2.229')
...
-----------------
Added user "test"
-----------------
  User login: test
  First name: Töst
  Last name: Üßer
  Full name: Töst Üßer
  Display name: Töst Üßer
  Initials: TÜ
...

PEP 538 explains the issue: https://www.python.org/dev/peps/pep-0538/
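The mangled argument in the debug output above can be reproduced in isolation. The following is a minimal sketch, assuming an interpreter that decodes command-line bytes as ASCII with the surrogateescape error handler (the behaviour under LC_ALL=C before the locale coercion described in PEP 538). The helper name can_encode is hypothetical and exists only for this illustration.

```python
# Sketch: how a UTF-8 argument turns into lone surrogates under an ASCII locale.
name = "človek.test"
raw = name.encode("utf-8")  # the bytes the shell hands to the process

# ASCII cannot decode 0xC4 0x8D, so surrogateescape maps each undecodable
# byte to a lone surrogate in the range U+DC80..U+DCFF.
mangled = raw.decode("ascii", errors="surrogateescape")
print(ascii(mangled))  # '\udcc4\udc8dlovek.test'

# The mapping is lossless: escaping back to bytes recovers the original text.
restored = mangled.encode("ascii", errors="surrogateescape").decode("utf-8")


def can_encode(text, codec):
    """Hypothetical helper: True if text encodes with a strict error handler."""
    try:
        text.encode(codec)
        return True
    except UnicodeError:
        return False


# Strict encoders reject lone surrogates, so later processing of the value fails.
print(can_encode(name, "utf-8"), can_encode(mangled, "utf-8"))  # True False
```

The round trip shows that no data is lost at argv decoding time; the failures appear later, when the string reaches code that expects well-formed Unicode.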

refuse ascii encoding and other non-UTF-8 codecs

I recommend that IPA's API refuse to operate with a non-UTF-8 codec for file system and I/O encoding. LC_ALL=C or a non-UTF-8 LC_CTYPE affects more than just sys.argv. The settings also affect file name encoding and I/O on standard streams. Users should either set LC_ALL="C.UTF-8" or LANG=C with LC_CTYPE=C.UTF-8 (LC_ALL, when set, overrides LC_CTYPE). C locales are a common issue in containers.

import sys
# Refuse to run unless the file system encoding is UTF-8
if sys.getfilesystemencoding().lower() not in {'utf-8', 'utf8'}:
    raise ScriptError("Set LC_ALL=C.UTF-8")
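Since the recommendation covers both file system and standard stream encodings, the check could be widened along these lines. This is only a sketch, not the code that was merged: is_utf8 and non_utf8_encodings are hypothetical names, and codecs.lookup is used so that aliases such as "UTF8" or "utf_8" resolve to the same codec.

```python
import codecs
import sys


def is_utf8(encoding):
    """Return True if the codec name (or one of its aliases) resolves to UTF-8."""
    if not encoding:
        return False
    try:
        return codecs.lookup(encoding).name == "utf-8"
    except LookupError:
        # Unknown codec names are treated as "not UTF-8".
        return False


def non_utf8_encodings():
    """Hypothetical helper: map each checked source to its non-UTF-8 encoding."""
    found = {"filesystem": sys.getfilesystemencoding()}
    for stream_name in ("stdin", "stdout", "stderr"):
        stream = getattr(sys, stream_name, None)
        found[stream_name] = getattr(stream, "encoding", None)
    # Streams without a declared encoding (None) are skipped rather than flagged.
    return {k: v for k, v in found.items() if v is not None and not is_utf8(v)}


if __name__ == "__main__":
    problems = non_utf8_encodings()
    if problems:
        raise SystemExit("Set LC_ALL=C.UTF-8 (non-UTF-8: %r)" % problems)
```

Reporting which source is misconfigured makes the error easier to act on in containers, where only some of the locale variables may be set.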

refuse surrogate escapes

We could just refuse surrogate escapes in argv, too. I don't recommend this approach because it can hide bugs with file I/O and standard streams.

import sys
import six

# Lone surrogates '\udc80' to '\udcff', as produced by the surrogateescape error handler
SURROGATES = set(six.unichr(c) for c in range(0xdc80, 0xdcff + 1))
for arg in sys.argv:
    if set(arg).intersection(SURROGATES):
        raise ScriptError("Set LC_ALL=C.UTF-8")

Metadata Update from @cheimes:
- Issue priority set to: critical (was: important)

6 years ago

master:

  • e1bd827 Require UTF-8 fs encoding

Metadata Update from @cheimes:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

6 years ago

ipa-4-6:

  • 1ea1fdd Require UTF-8 fs encoding

Why is LC_ALL/LANG forced to C.UTF-8 when setting LC_CTYPE should be enough?

master:

  • c925b44 Load certificate files as binary data

master:

  • 0a5a7bd Fix test_cli_fsencoding on Python 3.7

ipa-4-7:

  • 55e7a58 Fix test_cli_fsencoding on Python 3.7
