I am building a local analytics data lake to collect, store, and analyze large data sets. Some of the data will be kept for archiving purposes and some will be accessed constantly. Historically I would have used MySQL and then BigQuery for analytics.
My question: is the Local Big Data Environment a valid alternative to a local MySQL install?
@nxfxcom technically you could use the Local Big Data Environment and store the data as CSV, ORC, or Parquet files underneath, so you could still access the files individually even if the environment is no longer there. You can get an impression here:
But I would advise against it, since big data technology makes the most sense when it is used across several nodes, and I think the KNIME implementation of Hive might be meant more for educational purposes than as a fully maintained production environment. For storing large amounts of data in a local environment, systems like MySQL (or Postgres or MariaDB) might be better suited.
If you are looking for a single-file solution, you might take a look at H2 (how to get the latest H2 driver) or SQLite, but they might have some limitations when it comes to very large data sets.
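To give a rough idea of what the single-file approach looks like, here is a minimal sketch using Python's built-in sqlite3 module. The file name, the `events` table, and the sample rows are made up for illustration and are not from your setup; this only shows that the whole database lives in one file that you can query with plain SQL, and it is not a recommendation for very large data sets.

```python
import os
import sqlite3
import tempfile


def build_demo_db(path):
    """Create a single-file SQLite database with a small, hypothetical table."""
    conn = sqlite3.connect(path)
    try:
        # "with conn" wraps the statements in one transaction and commits it.
        with conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS events ("
                "id INTEGER PRIMARY KEY, "
                "source TEXT NOT NULL, "
                "value REAL NOT NULL)"
            )
            # Hypothetical sample rows, only for illustration.
            conn.executemany(
                "INSERT INTO events (source, value) VALUES (?, ?)",
                [("sensor_a", 1.5), ("sensor_a", 2.5), ("sensor_b", 10.0)],
            )
    finally:
        conn.close()


def average_by_source(path):
    """Run a simple analytical query and return {source: average value}."""
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute(
            "SELECT source, AVG(value) FROM events "
            "GROUP BY source ORDER BY source"
        ).fetchall()
    finally:
        conn.close()
    return dict(rows)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "analytics_demo.sqlite")
        build_demo_db(db_path)
        print(average_by_source(db_path))
```

The same file can later be opened from other tools that speak SQLite, so the data stays accessible without a running server.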