Apache Nifi Use in Mojaloop Stack #124
@PaulGregoryBaker presented a proposal for using Apache NiFi in the Mojaloop stack to support reporting use cases. Infitx initially created a MongoDB store for events from the events framework as a reporting data store, but realised this was not sufficient to give the information required. They introduced the concept of an event processing component to provide greater detail and more data to report on, e.g. the double volume of events necessary for reporting on FX transfers. Initial thoughts suggest a MongoDB store may suit all needs; JB questions this, given that MongoDB is essentially an unstructured document store with some indexing on top.

PB showed a series of UI mockups illustrating typical use cases. JB suggests these are more bizops (tracing a transfer) than general-purpose analytics, i.e. asking ad hoc questions of a large dataset.

PB showed prior thinking, against which MR had raised an issue: domain events may be missed, leading to inconsistent state. PB then showed current thinking. Questions arise around a polling approach vs a CDC/event-driven approach. JB asks whether we have looked at typical open-source analytics pipeline approaches. MR points out that what is presented appears OK to solve the current issue and that NiFi may be the right piece for a wider analytics pipeline solution. MMH suggests using fulfil and reject Kafka messages to trigger reads to the replica. The group accepts this would be worth trying in order to determine relative performance.

JB raises a general concern with MongoDB as an all-purpose reporting data store, suggesting that it has a narrow set of "sweet spot" use cases, reporting not being one of them. MR says we are talking about two different things:

1. diagnostic reporting, e.g. "show me what happened with this transfer";
2. analytics reporting, being a general set of use cases.

One solution may suit both, or we may need separate solutions for each. We should not assume one will be suitable.
Vijay mentions ClickHouse as an example of a general-purpose reporting data store that is column-based, vs Mongo's row-based architecture. JB: there seems to be no reason right now not to have NiFi in the architecture (MR: so long as it is performant enough and accurate/consistent/integral enough), given that wider thinking is needed to determine a solid analytics architecture going forward.

JB & MR: what guarantees are there about presenting accurate information to reporting users, given there may be unreliable components between the application state store (main MySQL), the replicas and the reporting data store? We need to make sure we understand this. PB says this is a detail of the implementation. JB suggests that we should make sure to be fully aware of all the issues in this pipeline that may introduce inconsistency and/or present inaccurate information to system users.

MR also raises the upcoming TigerBeetle integration: what will the single source of truth be when that work is complete? TB is likely to hold only a subset of transfer information. The analytics solution needs to be able to properly combine multiple sources of data consistently.

JB raises the option of KSQL to transform Kafka messages to a relational schema. Vijay mentions that retention of Kafka topics is costly; Infitx have found Kafka retention to be a very large storage issue.

MR raises that we have an upcoming increase in the volume of data we will need to hold as a result of ISO 20022 message formats. Any analytics/reporting solution must be able to cope with ISO-specific data points.

JB summarises that the proposed solution is not objected to by anyone present, but that more thinking is needed on the wider issues around analytics and reporting.
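
To make MMH's suggestion in these notes more concrete (using fulfil and reject Kafka messages to trigger reads to the replica, instead of polling), here is a minimal sketch in Python. It is an illustration under stated assumptions, not the proposed implementation: the Kafka consumer, the replica and the reporting store are replaced by in-memory stand-ins, and the names `TransferEvent`, `sync_on_terminal_events`, the `action` field and the example state values are all hypothetical. The sketch also records transfers that the replica does not yet return, since JB and MR's concern about unreliable components between the main store, the replicas and the reporting store would show up at exactly this point.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional

# Assumption: only these two actions mark a transfer as finished.
# The real event payloads and action names may differ.
TERMINAL_ACTIONS = {"fulfil", "reject"}


@dataclass(frozen=True)
class TransferEvent:
    """Hypothetical, simplified shape of a message read from a Kafka topic."""
    transfer_id: str
    action: str


def sync_on_terminal_events(
    events: Iterable[TransferEvent],
    read_replica: Callable[[str], Optional[dict]],
    reporting_store: Dict[str, dict],
) -> List[str]:
    """Copy a transfer into the reporting store when a terminal event arrives.

    Returns the ids of transfers whose terminal event was seen but which the
    replica could not return (for example because of replication lag), so the
    caller can retry them instead of silently reporting stale data.
    """
    missing: List[str] = []
    for event in events:
        if event.action not in TERMINAL_ACTIONS:
            continue  # non-terminal events do not trigger a read
        row = read_replica(event.transfer_id)
        if row is None:
            missing.append(event.transfer_id)
            continue
        reporting_store[event.transfer_id] = row  # upsert keyed by transfer id
    return missing


# Stand-ins for the replica and the event stream (illustrative values only).
replica = {"t1": {"state": "COMMITTED"}, "t2": {"state": "ABORTED"}}
events = [
    TransferEvent("t1", "prepare"),
    TransferEvent("t1", "fulfil"),
    TransferEvent("t2", "reject"),
    TransferEvent("t3", "fulfil"),  # terminal event seen before the replica has the row
]
store: Dict[str, dict] = {}
missing = sync_on_terminal_events(events, replica.get, store)
```

In this toy run `t1` and `t2` are copied into `store`, while `t3` ends up in `missing`. How such misses are retried, and what reporting users are shown in the meantime, is the kind of consistency question the group agreed needs to be understood.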