buffered performance improvements and bug fix #2890
Conversation
Follows same convention as `collections/persistent` by creating a `benchmarks` subdirectory.
This commit enhances the performance of the `buffered` package by adding pointer-conversion-based reads and writes. The new functions only work for `Array[U8]`s. This commit also adds the ability to convert numeric array types between each other for reading.
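As a point of reference, here is a minimal sketch of the kind of fixed-width writes and reads this touches, using the existing public `Writer`/`Reader` API from the `buffered` package. That these calls benefit from the new pointer-conversion path is an inference from the description above, and the values are made up for illustration.

```pony
use "buffered"

actor Main
  new create(env: Env) =>
    let wr = Writer
    wr.u32_be(0xDEADBEEF) // illustrative values; fixed-width writes like these
    wr.u16_le(1234)       // exercise the byte-level write path

    // Feed the written chunks back into a Reader and read the values out again.
    let rd = Reader
    let chunks: Array[ByteSeq] val = wr.done()
    for chunk in chunks.values() do
      match chunk
      | let a: Array[U8] val => rd.append(a)
      | let s: String => rd.append(s.array())
      end
    end

    try
      env.out.print(rd.u32_be()?.string()) // 3735928559
      env.out.print(rd.u16_le()?.string()) // 1234
    end
```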
Prior to this commit, if an unsuccessful search was performed, the `_distance_of` function would cache the last node checked for a value, and future searches would resume from the node after the cached node. Unfortunately, this results in incorrect behavior (an error) if someone searches for a value that doesn't exist in the buffer and then searches for a value that does exist: the second search picks up where the failed search left off, reaches the end of the buffer, and concludes that the value doesn't exist. If the search instead started from the current position in the buffer, the existing value would be found correctly by `_distance_of`. This commit fixes the bug by removing the caching and always having `_distance_of` restart its search from the current position in the buffer. It also adds a couple of test cases to explicitly check that the searches work correctly.
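To make the failure mode concrete, here is a minimal sketch of the sequence that used to error. It assumes `read_until()` is one of the public `Reader` entry points that goes through `_distance_of`; the buffer contents and separator bytes are made up for illustration.

```pony
use "buffered"

actor Main
  new create(env: Env) =>
    let rb = Reader
    rb.append("hello\n".array()) // illustrative data

    // Search for a byte that is not in the buffer; this errors, as expected.
    try rb.read_until(':')? end

    // Now search for a byte that *is* in the buffer. Before this fix, the
    // cached position from the failed search above caused this to error too;
    // with the fix it correctly returns "hello".
    try
      let line: Array[U8] val = rb.read_until('\n')?
      env.out.print(String.from_array(line))
    end
```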
packages/buffered/writer.pony (Outdated)
// into a single bigger array
if data.size() <= 64 then
  match data
  | let d: String => let a = d.array(); _current.copy_from(a, 0, _current.size(), a.size())
This is wider than 80 columns, and it has two statements separated by a semicolon, which is usually a discouraged style.
You should probably split it onto separate lines.
Fixed.
packages/builtin/array.pony (Outdated)
""" | ||
let bytes = _element_size() | ||
let u16_bytes = U16(0).bytewidth() | ||
if ((offset*bytes) + u16_bytes) <= (_size*bytes) then |
Please surround the `*` operator with a space on each side, here and in the other places in this file below.
Fixed.
packages/builtin/array.pony (Outdated)
""" | ||
Reads a U16 from offset. This is only allowed for an array of U8s. | ||
""" | ||
let bytes = _element_size() |
If this function is only allowed for `Array[U8]`, why do we need `_element_size()` here? Won't it always be `1`? Or am I missing something?
Yes, `_element_size()` will always be `1`. I had originally written these functions to work for all numeric array types and then later decided to limit them only to `Array[U8]`.

I left `_element_size()` because I had assumed the compiler would optimize it away. Let me know if you would prefer I replace `_element_size()` with `1`.
I'm not sure to what degree it will get optimized away at runtime - I'm mainly concerned with making sure the code is no more complex than it needs to be, for maintainability reasons. So my vote would be to remove it, along with the multiplication operations below where the `bytes` local is used.
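For reference, a hypothetical sketch of the simplified bounds check being suggested here; the names mirror the diff excerpts above, and the wrapper primitive is purely illustrative rather than the actual `builtin/array.pony` code.

```pony
primitive _BoundsCheckSketch
  fun in_bounds_for_u16(offset: USize, size: USize): Bool =>
    // before: let bytes = _element_size()
    //         ((offset * bytes) + U16(0).bytewidth()) <= (size * bytes)
    // after, since the element size is always 1 for Array[U8]:
    (offset + U16(0).bytewidth()) <= size
```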
Complexity removed.
@@ -0,0 +1,13 @@
primitive NumericArrayConverter
This doesn't seem to be required for the `buffered` perf improvements, and as a new feature it should probably go through the RFC process (or, at the very least, be in a separate PR).
Removed. It was included because this was my initial plan/approach to solving the problem before adding in the functions on `Array[U8]`.
@@ -752,7 +752,7 @@ class iso _TestFileMixedWriteQueue is UnitTest
      line3 + "\n"
      line5
      line1
-     line3 + "\n"
+     line3
Is this change related to the others?
Yes, this is due to the `Writer` now coalescing `write()` buffers into one big buffer if the buffers being added are smaller than a cacheline (https://github.com/ponylang/ponyc/pull/2890/files#diff-761128221a9d08227d2ae82f79a30ea2R258).

The fact that the output of `Writer.done()` can be split across multiple `Array[ByteSeq]`s is an implementation detail (one which changes slightly due to the new coalescing behavior of `Writer.write()`).

The test being fixed relies on the `printv()` call adding a `\n` after every individual buffer returned from `Writer.done()`. The new coalescing produces 1 buffer instead of 2, so the test fails without this fix.
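For illustration, a small sketch of that behavior, assuming the sub-cacheline (64-byte) coalescing threshold described above: two small `write()` calls now come back from `done()` as a single chunk rather than two, which is exactly what invalidated the test's expected output.

```pony
use "buffered"

actor Main
  new create(env: Env) =>
    let wb = Writer
    // Two writes, each well under the 64-byte coalescing threshold ...
    wb.write("line1\n")
    wb.write("line3\n")

    // ... so done() is expected to hand back one coalesced chunk instead of
    // two. The exact chunk count is an implementation detail, so callers
    // (and tests) should not depend on it.
    let chunks: Array[ByteSeq] val = wb.done()
    env.out.print(chunks.size().string())
```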
@jemc are you good with these changes?
@dipinhora can you add release notes that should be included with this release into this PR?
@SeanTAllen is the commit message not adequate for what the release notes require?
@dipinhora that's a lot of information, something pared down to a short paragraph would be best.
@SeanTAllen How's the following?
This PR makes a number of enhancements to the `buffered` package.

Add benchmarks to buffered package

Follows same convention as `collections/persistent` by creating a `benchmarks` subdirectory.

Efficient buffer reading and writing + numeric array conversions

This commit enhances the performance of the `buffered` package by adding pointer conversion based reads and writes. The new functions are limited to only work for `Array[U8]`s.

This commit also adds in the ability to convert numeric array types between each other for reading.

Fix Files.mixedwrite test for coalescing buffer behavior.

Fix bug related to searching within a Reader

Prior to this commit, if an unsuccessful search was performed, the `_distance_of` function would cache the last node checked for a value and future searches would resume from the next node after the cached node. Unfortunately, this results in incorrect behavior (an error) if someone were to search for a value that doesn't exist in the buffer and then search for a value that does exist in the buffer. This is because the search for the existing value picks up where the search for the non-existing value left off, reaches the end of the buffer, and determines that the value doesn't exist. However, if the search were to be conducted starting from the current position in the buffer, then the existing value would be correctly found by the `_distance_of` function.

This commit fixes this by removing the caching and always having `_distance_of` restart its search from the current position in the buffer. It also adds a couple of test cases to explicitly check that the searches are working correctly.
The following are some very unscientific, uncontrolled benchmark results from before and after the changes, to illustrate the performance gains.
Benchmarks before these changes:
Benchmarks after these changes: