`TryteString.as_string` needs some re-branding #90

todofixthis · 2017-10-31T20:12:50Z

TryteString.as_string desperately needs to be renamed; a lot of users are confusing it with __str__.

todofixthis · 2017-10-31T20:13:51Z

My recommendation is to call it decode:

Users will be familiar with this method because it is built into Python strings (and TryteString is supposed to "feel" like the Tangle version of a Python string).
Users who are familiar with how bytes.decode works will also be able to grasp more easily that this is a "trytes -> bytes -> characters" process (especially once #62, "TryteString.as_bytes() does not work as expected", is implemented), so they are less likely to confuse it with __str__.

mlouielu · 2017-11-01T04:33:42Z

Conclusion

+1 for renaming as_string to decode, and as_bytes to encode.

I think TryteString should behave like this:

>>> import iota
>>> ts = iota.codecs.encode(b'EXAMPLE'.decode('ascii'), 'utf-8')  # Return TryteString
>>> ts = iota.codecs.encode('EXAMPLE', 'utf-8')                    # Return TryteString
>>> ts = iota.TryteString.from_string('EXAMPLE')
>>> ts = iota.TryteString.from_bytes(b'*\x15d\x96\xb5\x121\x8b\x01')
>>> ts = iota.TryteString('OBGCKBWBZBVBOB')
>>> ts
iota.TryteString('OBGCKBWBZBVBOB')
>>> ts.encode()                          # encode "tryte-string" to "tryte-in-bytes"
b'*\x15d\x96\xb5\x121\x8b\x01'
>>> ts.decode('utf-8')                   # decode "tryte-string" to "str"
>>> ts.decode()                          # default with utf-8
'EXAMPLE'
>>> str(ts)
'OBGCKBWBZBVBOB'
>>> bytes(ts)                           # Not b'OBGCKBWBZBVBOB'
b'*\x15d\x96\xb5\x121\x8b\x01'

Explanation

Users are confused about the difference between 'EXAMPLE', iota.Hash('EXAMPLE'), iota.Hash(b'EXAMPLE'), str(iota.Hash('EXAMPLE')), and bytes(iota.Hash('EXAMPLE')):

'EXAMPLE': a string, which may be a tryte string or an ordinary Python string
iota.Hash('EXAMPLE'): a TryteString whose value is initialized from 'EXAMPLE'
iota.Hash(b'EXAMPLE'): a TryteString whose value is initialized from b'EXAMPLE' (equivalent to 'EXAMPLE')
str(iota.Hash('EXAMPLE')): the tryte string as str, taken from iota.TryteString('EXAMPLE')
bytes(iota.Hash('EXAMPLE')): the tryte string as bytes, taken from iota.TryteString('EXAMPLE')

The point is that TryteString.__init__ accepts either str or bytes, and in both cases the argument represents a "tryte string".

No encoding or decoding is involved. So it makes sense that str(iota.Hash('EXAMPLE')) is 'EXAMPLE' and bytes(iota.Hash('EXAMPLE')) is b'*\x15d\x96\xb5\x121\x8b\x01'.

However, from_string and as_string do involve encoding and decoding: from_string encodes the input string as UTF-8 and passes the result to from_bytes. Therefore, these are the same:

>>> iota.Hash.from_string('妳好') == iota.Hash.from_bytes('妳好'.encode('utf-8'))
True
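The equivalence above can be sketched with a small stand-in class. `SketchTrytes` is hypothetical and is not PyOTA's `TryteString`; the two-trytes-per-byte scheme is an assumption chosen because it reproduces the "EXAMPLE" -> "OBGCKBWBZBVBOB" mapping quoted in this thread.

```python
ALPHABET = "9ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # 27 tryte symbols


class SketchTrytes:
    """Hypothetical stand-in for TryteString (illustration only)."""

    def __init__(self, trytes: str) -> None:
        self.trytes = trytes

    def __eq__(self, other: object) -> bool:
        return isinstance(other, SketchTrytes) and self.trytes == other.trytes

    @classmethod
    def from_bytes(cls, data: bytes) -> "SketchTrytes":
        # Accepts *any* bytes: each byte becomes two trytes (low digit first).
        pairs = (divmod(byte, len(ALPHABET)) for byte in data)
        return cls("".join(ALPHABET[low] + ALPHABET[high] for high, low in pairs))

    @classmethod
    def from_string(cls, text: str) -> "SketchTrytes":
        # The only extra step is the UTF-8 encoding of the text.
        return cls.from_bytes(text.encode("utf-8"))


same = SketchTrytes.from_string("妳好") == SketchTrytes.from_bytes("妳好".encode("utf-8"))
print(same)  # True
```

In this sketch the plain constructor performs no conversion at all, which is the distinction the comment is drawing.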

The deeper problem here comes from from_bytes. It takes not the "tryte string in bytes format" but "any bytes".

As I see it, we have mixed two different conversions into one type. We want to do something like str/bytes -> tryte-string -> TryteString -> bytes, and also tryte-string (as str or bytes) -> TryteString.

# str/bytes -> tryte-string
"This is a message from GitHub" -> "CCWCXCGDEAXCGDEAPCEAADTCGDGDPCVCTCEAUCFDCDADEAQBXCHDRBIDQC"

# tryte-string -> TryteString
"CCWCXCGDEAXCGDEAPCEAADTCGDGDPCVCTCEAUCFDCDADEAQBXCHDRBIDQC" -> TryteString("CCWCXCGDEAXCGDEAPCEAADTCGDGDPCVCTCEAUCFDCDADEAQBXCHDRBIDQC")

# TryteString -> bytes
TryteString("CCWCXCGDEAXCGDEAPCEAADTCGDGDPCVCTCEAUCFDCDADEAQBXCHDRBIDQC") -> b'T\xf4\xe6\xcd\xbc\x0bNf.\xcb\xb7\x0bm\xeb@\xce^\x17L\xeb.R\x08&oT.\xe6\x05\x1at\x94R\xe9\x08'

---------

# str/bytes -> tryte-string
"EXAMPLE" -> "OBGCKBWBZBVBOB"

# tryte-string -> TryteString
"OBGCKBWBZBVBOB" -> TryteString("OBGCKBWBZBVBOB")

# tryte-string -> TryteString
"EXAMPLE" -> TryteString("EXAMPLE")
b"EXAMPLE" -> TryteString("EXAMPLE")
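The first conversion in the diagrams above (str/bytes -> tryte-string) can be reproduced with a few lines of Python. This is a minimal sketch, assuming each byte maps to two trytes over the alphabet `9A-Z`; the function names `bytes_to_trytes` and `trytes_to_bytes` are made up for illustration and are not part of PyOTA.

```python
ALPHABET = "9ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # 27 tryte symbols


def bytes_to_trytes(data: bytes) -> str:
    """Convert arbitrary bytes to a tryte string, two trytes per byte."""
    out = []
    for byte in data:
        high, low = divmod(byte, len(ALPHABET))
        out.append(ALPHABET[low])   # low base-27 digit first
        out.append(ALPHABET[high])  # then the high digit
    return "".join(out)


def trytes_to_bytes(trytes: str) -> bytes:
    """Reverse of bytes_to_trytes; rejects input that is not a byte sequence."""
    if len(trytes) % 2:
        raise ValueError("expected an even number of trytes")
    out = bytearray()
    for i in range(0, len(trytes), 2):
        value = ALPHABET.index(trytes[i]) + ALPHABET.index(trytes[i + 1]) * len(ALPHABET)
        if value > 255:
            raise ValueError("tryte pair does not fit in one byte")
        out.append(value)
    return bytes(out)


print(bytes_to_trytes(b"EXAMPLE"))         # OBGCKBWBZBVBOB
print(trytes_to_bytes("OBGCKBWBZBVBOB"))   # b'EXAMPLE'
```

The second conversion (tryte-string -> TryteString) is only a wrapping step, so it needs no codec at all.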

BTW, @todofixthis, you are using Python 2, right? In Python 3, str can only be encoded to bytes, and bytes can only be decoded to str; str has no decode method. I think that's why I'm stuck on TryteString.decode(): if TryteString acts like a Python string, it can't have decode in Python 3...
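The Python 3 asymmetry mentioned here is easy to check with built-in types only:

```python
text = "EXAMPLE"
raw = text.encode("utf-8")       # str -> bytes

print(type(raw).__name__)        # bytes
print(raw.decode("utf-8"))       # EXAMPLE (bytes -> str)

# On Python 3, each type offers only one direction.
print(hasattr(text, "decode"))   # False
print(hasattr(raw, "encode"))    # False
```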

todofixthis · 2017-11-01T06:12:43Z

These are great ideas, thanks @mlouielu !

tl;dr version: Overall, I think we're in agreement; I just have a couple of minor changes to request:

TryteString.__str__ and TryteString.__bytes__ can stay the way they are.
Use built-in codecs.{de,en}code instead of iota.codecs.{de,en}code. Depending on how PyOTA is using these functions internally, this part could get a bit complicated; we might want to wait until we tackle #62 ("TryteString.as_bytes() does not work as expected").
Make sure we can support the legacy ASCII codec (though this is more applicable to #62).

Changes to be made:

Rename TryteString.as_bytes to encode.
Rename TryteString.as_string to decode.

Everything else can stay the way it is — we'll make additional changes for #62, but for #90, I think we only need to rename a couple of methods.

Let's tackle this one item at a time:

1. `__init__`

I like the idea of TryteString('FOO') == TryteString(b'FOO'). In fact, this is what PyOTA does currently.

2. `__str__` and `__bytes__`

To be consistent, __str__ and __bytes__ should either:

Return the ASCII representation of the trytes:

str(TryteString('ZQEHP9QXNTJHDBNZZCEOBHRBNJHDWM')) == 'ZQEHP9QXNTJHDBNZZCEOBHRBNJHDWM'
bytes(TryteString('ZQEHP9QXNTJHDBNZZCEOBHRBNJHDWM')) == b'ZQEHP9QXNTJHDBNZZCEOBHRBNJHDWM'

OR return the binary representation of the trytes:

str(TryteString('ZQEHP9QXNTJHDBNZZCEOBHRBNJHDWM')) == '你好，世界！'
bytes(TryteString('ZQEHP9QXNTJHDBNZZCEOBHRBNJHDWM')) == b'\xe4\xbd\xa0\xe5\xa5\xbd\xef\xbc\x8c\xe4\xb8\x96\xe7\x95\x8c\xef\xbc\x81'

I think the former satisfies the Principle of Least Astonishment. Additionally, it conforms to the Zen of Python ("There should be one-- and preferably only one --obvious way to do it.") because we will use encode/decode to get binary representations of TryteStrings anyway.
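A minimal sketch of the first option, where `__str__` and `__bytes__` return the ASCII representation of the trytes and the constructor treats str and bytes input alike. `SketchTrytes` is a hypothetical stand-in, not PyOTA's class.

```python
class SketchTrytes:
    """Hypothetical stand-in for TryteString (illustration only)."""

    def __init__(self, trytes) -> None:
        # str and bytes input both name the same trytes; no codec is involved.
        if isinstance(trytes, (bytes, bytearray)):
            trytes = bytes(trytes).decode("ascii")
        self._trytes = trytes

    def __eq__(self, other: object) -> bool:
        return isinstance(other, SketchTrytes) and self._trytes == other._trytes

    def __repr__(self) -> str:
        return "SketchTrytes({!r})".format(self._trytes)

    def __str__(self) -> str:
        return self._trytes

    def __bytes__(self) -> bytes:
        return self._trytes.encode("ascii")


ts = SketchTrytes("ZQEHP9QXNTJHDBNZZCEOBHRBNJHDWM")
print(str(ts))    # ZQEHP9QXNTJHDBNZZCEOBHRBNJHDWM
print(bytes(ts))  # b'ZQEHP9QXNTJHDBNZZCEOBHRBNJHDWM'
print(SketchTrytes("FOO") == SketchTrytes(b"FOO"))  # True
```

With this choice, any conversion to the underlying binary data stays in explicit encode/decode calls.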

3. `iota.codecs.encode` and `iota.codecs.decode`

This is not necessary, as we can leverage Python's built-in codecs system.

To decode bytes into trytes:

>>> from codecs import encode, decode

>>> bytes_ = '你好，世界！'.encode('utf-8')
>>> decode(bytes_, 'trytes_binary')
TryteString('ZQEHP9QXNTJHDBNZZCEOBHRBNJHDWM')

# Using legacy ASCII codec:
>>> decode(bytes_, 'trytes_ascii')
TryteString('LH9GYEMHCF9GWHZFEELHVFOEOHNEEEWHZFUD')

To encode strings into trytes:

>>> str_ = '你好，世界！'
>>> encode(str_, 'trytes_binary')
TryteString('ZQEHP9QXNTJHDBNZZCEOBHRBNJHDWM')

# Using legacy ASCII codec:
>>> encode(str_, 'trytes_ascii')
TryteString('LH9GYEMHCF9GWHZFEELHVFOEOHNEEEWHZFUD')
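For readers unfamiliar with Python's codec registry, here is a rough sketch of how a custom codec can be plugged into the built-in `codecs.encode` and `codecs.decode`. The codec name `sketch_trytes_ascii`, the helper functions, and the choice that encode means "bytes -> trytes" are all assumptions made for illustration; this is not PyOTA's implementation, and it works on bytes rather than returning a TryteString.

```python
import codecs

ALPHABET = b"9ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CODEC_NAME = "sketch_trytes_ascii"  # hypothetical codec name


def _encode(data, errors="strict"):
    """bytes -> trytes (as ASCII bytes), two trytes per input byte."""
    data = bytes(data)
    out = bytearray()
    for byte in data:
        high, low = divmod(byte, len(ALPHABET))
        out.append(ALPHABET[low])
        out.append(ALPHABET[high])
    return bytes(out), len(data)


def _decode(trytes, errors="strict"):
    """trytes (as ASCII bytes) -> bytes."""
    trytes = bytes(trytes)
    if len(trytes) % 2:
        raise ValueError("expected an even number of trytes")
    out = bytearray()
    for i in range(0, len(trytes), 2):
        # bytearray.append raises ValueError if the pair exceeds 255.
        out.append(ALPHABET.index(trytes[i]) + ALPHABET.index(trytes[i + 1]) * len(ALPHABET))
    return bytes(out), len(trytes)


def _search(name):
    """Search function for codecs.register; returns None for other names."""
    if name == CODEC_NAME:
        return codecs.CodecInfo(encode=_encode, decode=_decode, name=CODEC_NAME)
    return None


codecs.register(_search)

trytes = codecs.encode(b"EXAMPLE", CODEC_NAME)
print(trytes)                             # b'OBGCKBWBZBVBOB'
print(codecs.decode(trytes, CODEC_NAME))  # b'EXAMPLE'
```

Registering under a distinct name per scheme is what would let a binary codec and a legacy ASCII codec coexist, as the examples above propose.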

Note: PyOTA already uses decode and encode internally to convert some values, so we might have to get creative here.

4. `TryteString.from_bytes` and `TryteString.from_string`

I think we're in alignment here; I just need to make one minor tweak, because we also have to support the legacy ASCII codec. See next section.

5. `TryteString.encode` replaces `TryteString.as_bytes`

I like the rename, and I think it will resonate with Python users; it is the reverse of decode(bytes_, 'trytes_binary') from the example above:

decode(bytes_, 'trytes_binary').encode('trytes_binary') == bytes_
decode(bytes_, 'utf-8').encode('utf-8') == bytes_

We will need to support the legacy ASCII codec, so there needs to be an optional argument to that method:

# Using binary codec (default):
>>> TryteString('ZQEHP9QXNTJHDBNZZCEOBHRBNJHDWM').encode()
>>> TryteString('ZQEHP9QXNTJHDBNZZCEOBHRBNJHDWM').encode('trytes_binary')
b'\xe4\xbd\xa0\xe5\xa5\xbd\xef\xbc\x8c\xe4\xb8\x96\xe7\x95\x8c\xef\xbc\x81'

# Using legacy ASCII codec:
>>> TryteString('LH9GYEMHCF9GWHZFEELHVFOEOHNEEEWHZFUD').encode('trytes_ascii')
b'\xe4\xbd\xa0\xe5\xa5\xbd\xef\xbc\x8c\xe4\xb8\x96\xe7\x95\x8c\xef\xbc\x81'

6. `TryteString.decode` replaces `TryteString.as_string`

Similar comments as the previous section.

- `TryteString.from_string` is now `from_unicode`.
- `TryteString.as_string` is now `decode`.
- `TryteString.as_bytes` is now `encode`.
- Original methods are still available, but deprecated.

todofixthis · 2018-01-06T04:00:02Z

Summary of changes:

Rename TryteString.from_string to from_unicode.
Rename TryteString.as_bytes to encode.
Rename TryteString.as_string to decode.
Add deprecated versions of the renamed functions.
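One common way to keep renamed methods available is a thin alias that emits a `DeprecationWarning` and delegates to the new name. The sketch below assumes that pattern and uses a hypothetical `SketchTrytes` class with a simple two-trytes-per-byte codec; it does not show how PyOTA actually implements the deprecated methods.

```python
import warnings

ALPHABET = "9ABCDEFGHIJKLMNOPQRSTUVWXYZ"


class SketchTrytes:
    """Hypothetical stand-in for TryteString (illustration only)."""

    def __init__(self, trytes: str) -> None:
        self._trytes = trytes

    def encode(self) -> bytes:
        """trytes -> bytes (a trailing odd tryte is ignored in this sketch)."""
        pairs = zip(self._trytes[0::2], self._trytes[1::2])
        return bytes(
            ALPHABET.index(low) + ALPHABET.index(high) * len(ALPHABET)
            for low, high in pairs
        )

    def decode(self, encoding: str = "utf-8") -> str:
        """trytes -> bytes -> characters."""
        return self.encode().decode(encoding)

    def as_bytes(self) -> bytes:
        warnings.warn("as_bytes is deprecated; use encode instead.",
                      DeprecationWarning, stacklevel=2)
        return self.encode()

    def as_string(self) -> str:
        warnings.warn("as_string is deprecated; use decode instead.",
                      DeprecationWarning, stacklevel=2)
        return self.decode()


ts = SketchTrytes("OBGCKBWBZBVBOB")
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = ts.as_string()

print(legacy)                       # EXAMPLE
print(caught[0].category.__name__)  # DeprecationWarning
```

Passing `stacklevel=2` makes the warning point at the caller's line rather than at the alias itself.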

todofixthis · 2018-01-06T04:13:53Z

Scheduled for release: 2.0.4

todofixthis added the enhancement label Oct 31, 2017

todofixthis changed the title ~~TryteString.as_string needs a makeover~~ TryteString.as_string needs some re-branding Oct 31, 2017

This was referenced Dec 27, 2017

TryteString.as_bytes() does not work as expected #62

Open

Converting back to the original form vbakke/trytes#3

Open

todofixthis mentioned this issue Jan 6, 2018

Give TryteString a thesaurus #133

Merged

todofixthis closed this as completed Jan 6, 2018

todofixthis mentioned this issue Jan 26, 2018

Release/2.0.4 #149

Merged
