Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow stdClass objects in payload #54

Closed
codeliner opened this issue Jun 6, 2016 · 35 comments
Closed

Allow stdClass objects in payload #54

codeliner opened this issue Jun 6, 2016 · 35 comments

Comments

@codeliner
Copy link
Member

Arrays are copied instead of passed by reference. If you ensure that the array only contains arrays and scalar types and you pass such an array to an object that only allows read access you easily get an immutable object (like our message classes)
so far so good, BUT:

$json = "{}";
$assoc = json_decode($json, true);
if ($json !== json_encode($assoc)) {
  echo "Empty assoc becomes empty json array\n";
}

$stdClassObj = json_decode($json);
if ($json === json_encode($stdClassObj)) {
  echo "Empty stdclass obj becomes empty json object\n";
}

This means, that I cannot have a message object that is both: immutable and 100% correctly json encodable/decodable at the same time except I use some kind of deep copy mechanism when passing data into the message and out. I can image the latter influences performance.

What can we do to improve the situation?

@codeliner
Copy link
Member Author

IMHO we need to allow \stdClass objects in payload and accept that our messages are not immutable.
Please vote with 👍 and 👎 for this idea!

@bweston92
Copy link
Member

How will you check all properties are also valid, get_class_vars and iterate over them?
Also what if there is a reference to the object it self in there will that cause endless recursion?

@codeliner
Copy link
Member Author

@bweston92
Copy link
Member

Ok, what happens if I do

$obj = new stdClass;
$obj->a = &$obj;

How would assertSubPayload handle it?

@prolic
Copy link
Member

prolic commented Jun 6, 2016

var_dump(json_encode([]));

and

var_dump(json_decode('[]', true));

so how to you come to this?

$json = "{}";

@codeliner
Copy link
Member Author

codeliner commented Jun 6, 2016

@prolic non php system sending a message, for example JS client sends a json body and you turn this into a command payload

@prolic
Copy link
Member

prolic commented Jun 6, 2016

Can be bypassed very simple:

if ($json === "{}") { $json = '[]'; }

@Ocramius
Copy link

Ocramius commented Jun 6, 2016

I wouldn't allow stdClass support at all: it is very error prone and provides no advantages at all. If what you need is immutability, then an ArrayAccess implementation is a better (yet more complicated) solution, but please let's not use PHP's ancient (broken) object model.

@codeliner
Copy link
Member Author

@prolic yeah of course, but my example only illustrates the problem. In a nested structure it becomes hard to do this type conversation manually and it is impossible to do this a generic way because you simply don't have the information if the property of the nested object is an array or object when it is empty

@Ocramius
Copy link

Ocramius commented Jun 6, 2016

Is this just because of json encoding btw? Are there any real world examples on edge-cases that need it?

@prolic
Copy link
Member

prolic commented Jun 6, 2016

About nesting:

'{"foo":{}}' vs '{"foo":[]}', right?

What should the big difference be? Empty array or empty object, I don't care. A getter method on the message object should handle that difference, and decide what do to. I cannot imagine a payload where empty array or empty object make a semantic difference.

@Ocramius
Copy link

Ocramius commented Jun 6, 2016

I cannot imagine a payload where empty array or empty object make a semantic difference.

Same thoughts here, which is why I asked for edge cases

@codeliner
Copy link
Member Author

codeliner commented Jun 6, 2016

@Ocramius yes, it is because of json encoding. And I run into the issue several times.

consider this:

echo json_encode(['config' => ['optionalOptions' => []]]);
var config = JSON.parse(phpResponse);

config.optionalOptions.enableStrictJsonMode = true; //error wrong type, optionalOptions is an array

@codeliner
Copy link
Member Author

@prolic yeah, as long as your message is only handled by php but a message can be handled by any system, you don't know it so you should ensure that your json encoded message contains correct types.
A JSON-Schema validation will fail on consumer side ...

@Ocramius
Copy link

Ocramius commented Jun 6, 2016

Hmm, there are various solutions here:

  • moving to stdClass for everything ("just works" with PHP, but really sucks terribly. Meh.)
  • moving to a proper serializer (which is much much slower, but appropriate)
  • having own ArrayObject-alike structure that implements JsonSerializable (possibly more work, won't pass the array type-hints tho)

@prolic
Copy link
Member

prolic commented Jun 6, 2016

Hard to decide for me, but in general I have a strong feeling against this proposal.
The points of @Ocramius

  • moving to a proper serializer (which is much much slower, but appropriate)
  • having own ArrayObject-alike structure that implements JsonSerializable (possibly more work, won't pass the array type-hints tho)

seem like a better solution to me.

@Ocramius
Copy link

Ocramius commented Jun 6, 2016

Another one: documenting a limitation.

This last one might be a hard one to bite, but it might also be a good temporary solution, waiting until some better approaches emerge. After all, this thread was started only 40 mins ago.

@codeliner
Copy link
Member Author

codeliner commented Jun 6, 2016

@Ocramius I thought about the same: We should add something like that to the documentation: "You should design your message payload/API in a way that empty objects are not possible"

@codeliner
Copy link
Member Author

anyway, I'll play with that idea: having own ArrayObject-alike structure that implements JsonSerializable

@prolic
Copy link
Member

prolic commented Jun 6, 2016

Testet with Zend\Json:

use Zend\Json\Decoder;

var_dump(Decoder::decode('{"foo":[]}'));

var_dump(Decoder::decode('{"foo":{}}'));

var_dump(Decoder::decode('[]'));

var_dump(Decoder::decode('{}'));

So maybe we just switch to this instead of php's native functions and we are done.

@codeliner
Copy link
Member Author

codeliner commented Jun 6, 2016

@prolic the problem is with encoding PHP array to JSON (same problem with Zend\Json):

$obj = ['foo' => []];
json_encode($obj); // '{"foo":[]}'

$obj = ['foo' => new \stdClass];
json_encode($obj); // '{"foo":{}}'

$clientObj = json_decode('{"foo":{}}', true);
json_encode($clientObj); // '{"foo":[]}' <-- client gets different structure back: schema is invalid

$clientObj = json_decode('{"foo":{}}');
json_encode($clientObj); // '{"foo":{}}' <-- client gets same structure back: everything ok

$clientObj = json_decode('{"foo":{}}');
$proophCommand = Command::withPayload($clientObj); //Not allowed because $clientObj contains \stdClass object

Note: the last example does not throw an exception in the first place (except you ensure it in your message implementation) but when you do:
$proophCommand = $proophCommand->withAddedMetadata('some_meta', 'some_meta_value')

@prolic
Copy link
Member

prolic commented Jun 6, 2016

If we have the role that payload has to be array or scalar, this implies
that also in other languages those rules apply. The leads to {"foo":{}}
being invalid payload.
On the other hand {"foo":[]} would be valid and the stdclass on the payload
(not subpayload) can be converted to array.
Am 06.06.2016 18:41 schrieb "Alexander Miertsch" notifications@github.com:

@prolic https://github.com/prolic the problem is with encoding PHP
array to JSON (same problem with Zend\Json):

$obj = ['foo' => []];json_encode($obj); // '{"foo":[]}'$obj = ['foo' => new \stdClass];json_encode($obj); // '{"foo":{}}'$clientObj = json_decode('{"foo":{}}', true);json_encode($clientObj); // '{"foo":[]}' <-- client gets different structure back: schema is invalid$clientObj = json_decode('{"foo":{}}');json_encode($clientObj); // '{"foo":{}}' <-- client gets same structure back: everything ok$clientObj = json_decode('{"foo":{}}');$proophCommand = Command::withPayload($clientObj); //Not allowed because $clientObj contains \stdClass object

Note: the last example does not throw an exception in the first place
(except you ensure it in your message implementation) but when you do:
$proophMessage = $proophMessage->withAddedMetadata('some_meta',
'some_meta_value')


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#54 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AAYEvA6Q_a053a4pqCsu02yQncZPc6-Jks5qI_lBgaJpZM4Iuqdn
.

@codeliner
Copy link
Member Author

we have the rule that everything insight the payload must be an array or scalar: https://github.com/prooph/common/blob/master/src/Messaging/MessageDataAssertion.php#L82

@codeliner
Copy link
Member Author

codeliner commented Jun 6, 2016

PHP array can have a numeric index => JSON array
PHP array can be a hash map (assoc array) => JSON object

This means, that our definition of an array is not precise enough because we (json_encode) cannot distinguish between these two if PHP array is empty

The solution would be to use PHP \stdClass when it is a (JSON) object and PHP array when it is a (JSON) array, but \stdClass sucks

@prolic
Copy link
Member

prolic commented Jun 6, 2016

Ok but we can add a test if array is assoc or not and use this info before
json encoding.
Am 06.06.2016 19:17 schrieb "Alexander Miertsch" notifications@github.com:

PHP array can have a numeric index = JSON array
PHP array can be a hash map (assoc array) = JSON object

This means, that our definition of an array is not precise enough because
we (json_encode) cannot distinguish between these two if PHP array is empty


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#54 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AAYEvHL6z2sfg22c4gb9rDbjjnlBgkIfks5qJAGzgaJpZM4Iuqdn
.

@Ocramius
Copy link

Ocramius commented Jun 6, 2016

Yes, that is indeed the reason for going with an alternate data-structure. stdClass being a compromise for performance reasons (works well with internals).

@codeliner is the write to a hashmap the only scenario where this fails?

@prolic
Copy link
Member

prolic commented Jun 6, 2016

@codeliner can you write a test case that fails now and we can try
different implementations to turn the testa green again while testing
performance, too
Am 06.06.2016 19:20 schrieb "Marco Pivetta" notifications@github.com:

Yes, that is indeed the reason for going with an alternate data-structure.
stdClass being a compromise for performance reasons (works well with
internals).

@codeliner https://github.com/codeliner is the write to a hashmap the
only scenario where this fails?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#54 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AAYEvELXWHB04vSwPXoDuV3IJeTOIzZaks5qJAKRgaJpZM4Iuqdn
.

@prolic
Copy link
Member

prolic commented Jun 6, 2016

About stdclass and immutability, i am just writing fast here without
thinking much, ain't it possible to always use clone so you cannot modify
the payload from outside?
Am 06.06.2016 19:24 schrieb "Sascha-Oliver Prolic" <
saschaprolic@googlemail.com>:

@codeliner can you write a test case that fails now and we can try
different implementations to turn the testa green again while testing
performance, too
Am 06.06.2016 19:20 schrieb "Marco Pivetta" notifications@github.com:

Yes, that is indeed the reason for going with an alternate
data-structure. stdClass being a compromise for performance reasons
(works well with internals).

@codeliner https://github.com/codeliner is the write to a hashmap the
only scenario where this fails?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#54 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AAYEvELXWHB04vSwPXoDuV3IJeTOIzZaks5qJAKRgaJpZM4Iuqdn
.

@codeliner
Copy link
Member Author

@prolic only deep clone would work with recursive clone. There is also https://github.com/myclabs/DeepCopy by @mnapoli to handle circular references.
Good idea with the test case. I'll provide one or two later.

@Ocramius yeah, only problem is that an empty assoc should result in a JSON object and an empty array should result in a JSON array: $amIAssocOrArray = []; //?
If you have a schema definition at hand (for example a JSON-Schema) you could inspect the schema and determine the correct type, but this is again a performance nightmare.

@codeliner
Copy link
Member Author

I turned back to my problem space with the feedback collected here and thought about the problem again. I agree now that an empty object in a data structure is an edge case and can be avoided easily.

A nested object without any properties can be replaced by null to express the same state:
'{"foo": {}}' => '{"foo": null}'
The client just need to know that but that is a matter of good API documentation.

It is simply too much work required to improve the situation. We should add a note or warning in the docs and link to this issue. I keep it open until docs are updated, but I don't think that we need to investigate further work. It is not worth the effort.
Thx @bweston92 , @prolic , @Ocramius for feedback 👍

@Ocramius
Copy link

Ocramius commented Jun 6, 2016

Glad that the simplest workaround was picked for now: huge rabbit-hole otherwise :-)

@bweston92
Copy link
Member

Still add tests for array's with circular references?

@codeliner
Copy link
Member Author

@bweston92 yeah, will do

@prolic prolic closed this as completed Nov 10, 2016
@eyudkin
Copy link

eyudkin commented Jan 18, 2018

Strongly not agree: null is null, empty object is empty object. Its logically two different things and idea just always use null looks incorrect for me =/
ArrayObject looks pretty good compromise, but for now its impossible too: its not scalar or array..

@codeliner
Copy link
Member Author

@tensor146 That's true, but we decided to not introduce more than needed complexity. We have this mismatch in PHP when working with arrays and json_encode and while there are solutions available no of them is perfect. You can read the details in the history of the issue.

ArrayObject is not an option because it is passed by reference (unlike a real array) so we would lose the simple semi-immutable structure of our message objects. Immutability (or semi-immutability) is more important for us than null vs. empty object.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants