[Herddb-dev] R: Herddb entry on dbdb.io

Enrico Olivelli eolivelli at gmail.com
Fri Aug 23 23:25:32 CEST 2019


On Tue, 2 Jul 2019 at 10:51, Alessandro Luccaroni - Diennea <
alessandro.luccaroni at diennea.com> wrote:

> Regarding the open points:
>
> Checkpoint = is the “Consistent” option not clear, or is the doc link
> provided not clear? dbdb.io provides the following options: “Blocking”,
> “Consistent”, “Fuzzy”, “Non-Blocking”, “Not Supported”. Since the
> checkpoint freezes the DB (and a checkpoint contains all the info needed
> to bring up a replica), I think it should be “Consistent” + “Blocking” (see
> the PostgreSQL entry for some background around “Consistent”:
> https://dbdb.io/db/postgresql)
>

We are blocking only writes, but Consistent + Blocking is okay


>
>
> Joins = dbdb.io provides the following options: “Broadcast Join”, “Has
> Join”, “Index Nested Loop Join”, “Nested Loop Join”, “Not Supported”, “Semi
> Join”, “Shuffle Join”, “Sort-Merge Join”. Which ones do we choose? Is it
> going to change after the Calcite 1.20 upgrade?
>
Nested Loop, Sort-Merge Join (although there is a bug in Calcite, so it is
never actually used), Hash Join


>
>
> Storage Architecture = it’s related to where the data is saved (disk,
> memory or both). Since we are planning
> https://github.com/diennea/herddb/issues/401 I think our use case is
> “Hybrid” (in dbdb.io terminology, it means that you can choose between
> the two)
>
I would say "disk"


>
>
> Storage Model = “N-ary storage model” means that the columns are stored
> together on a row-by-row basis (like a very wide CSV?). I think we can
> safely say only “Key/Value” in our case
>
We are saving all of the columns of a row into the same page, all packed
together.
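
To make the "all packed" idea concrete, here is a minimal Java sketch of a
row whose columns are serialized into one value. This is only an
illustration under my own assumptions: `RowPacker` and its length-prefixed
layout are hypothetical and are not the actual HerdDB record format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, NOT the real HerdDB on-disk format.
final class RowPacker {

    // Pack every column of one row into a single byte array with the layout
    // [column count][length][bytes][length][bytes]...
    static byte[] pack(List<String> columns) {
        int size = Integer.BYTES;
        List<byte[]> encoded = new ArrayList<>();
        for (String column : columns) {
            byte[] bytes = column.getBytes(StandardCharsets.UTF_8);
            encoded.add(bytes);
            size += Integer.BYTES + bytes.length;
        }
        ByteBuffer buffer = ByteBuffer.allocate(size);
        buffer.putInt(encoded.size());
        for (byte[] bytes : encoded) {
            buffer.putInt(bytes.length);
            buffer.put(bytes);
        }
        return buffer.array();
    }

    // Rebuild the column values from the packed representation.
    static List<String> unpack(byte[] value) {
        ByteBuffer buffer = ByteBuffer.wrap(value);
        int count = buffer.getInt();
        List<String> columns = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            byte[] bytes = new byte[buffer.getInt()];
            buffer.get(bytes);
            columns.add(new String(bytes, StandardCharsets.UTF_8));
        }
        return columns;
    }
}
```

With a layout like this the whole row is one opaque value stored under its
key, which is why both "Key/Value" and "row/record" can describe it.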



>
>
> Storage Organization = I will keep both "Log-structured" and "Heaps" since
> we store data both in data files and in the Bookie log
>
Mostly "Log-structured"

>
>
> Concurrency Control = “Deterministic Concurrency Control”: is this
> guaranteed by BookKeeper? Can you point me to the code and/or documentation
> from which we can infer that?
>
Concurrency Control: we are using "pessimistic row-level locking". Before
accessing a record, the client acquires a lock (read or write).
Each transaction that modifies a record holds the new copy of the record in
a local buffer, and this new version is not visible to other transactions
until the transaction is committed.
BookKeeper is not involved here; we are using it only for the WAL.
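
A minimal Java sketch of the scheme described above, under my own
assumptions: `RowLockingTable` and `Transaction` are hypothetical names and
this is not the HerdDB code. A per-row read/write lock is taken before
access, and uncommitted versions live in a buffer that is private to the
transaction.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of pessimistic row-level locking, not HerdDB code.
final class RowLockingTable {
    private final Map<String, String> committed = new ConcurrentHashMap<>();
    private final Map<String, ReentrantReadWriteLock> locks = new ConcurrentHashMap<>();

    private ReentrantReadWriteLock lockFor(String key) {
        return locks.computeIfAbsent(key, k -> new ReentrantReadWriteLock());
    }

    // Read the committed version of a row under its read lock.
    String read(String key) {
        ReentrantReadWriteLock lock = lockFor(key);
        lock.readLock().lock();
        try {
            return committed.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    Transaction begin() {
        return new Transaction();
    }

    // Must be used from a single thread: the locks are owned by the thread.
    final class Transaction {
        // New row versions, invisible to everybody else until commit.
        private final Map<String, String> buffer = new HashMap<>();

        void update(String key, String value) {
            if (!buffer.containsKey(key)) {
                // First write to this row: take the write lock and keep it.
                lockFor(key).writeLock().lock();
            }
            buffer.put(key, value);
        }

        // A transaction sees its own pending changes first.
        String read(String key) {
            return buffer.containsKey(key) ? buffer.get(key) : RowLockingTable.this.read(key);
        }

        void commit() {
            committed.putAll(buffer); // publish the new versions
            release();
        }

        void rollback() {
            release(); // discard the buffer, committed data is untouched
        }

        private void release() {
            for (String key : buffer.keySet()) {
                lockFor(key).writeLock().unlock();
            }
            buffer.clear();
        }
    }
}
```

In this sketch a reader on another thread blocks on the row lock until the
writer commits or rolls back, and never sees the buffered value.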


>
>
> Query Compilation = “Code Generation”: is this related to our usage of
> Apache Calcite?
>
We are not "compiling" code, we have an access plan that is prepared by
Apache Calcite and then translated to an internal representation.


>
> Query Execution = “Vectorized Model”: is this related to our usage of
> Apache Calcite?
>
Where did you find the list of available values?
I feel "Vectorized Model" is not our option



Cheers
Enrico



>
>
> *Alessandro Luccaroni*
> Platform Manager @ Diennea - MagNews
> Tel.: (+39) 0546 066100 Int. 924 - Mob.: (+39) 393 7273519
> Viale G.Marconi 30/14 - 48018 Faenza (RA) - Italy
>
>
>
> *From:* herddb-dev-bounces at lists.herddb.org [mailto:
> herddb-dev-bounces at lists.herddb.org] *On behalf of *Enrico Olivelli
> *Sent:* Tuesday, 2 July 2019 00:02
> *To:* Herddb developers <herddb-dev at lists.herddb.org>
> *Subject:* Re: [Herddb-dev] Herddb entry on dbdb.io
>
>
>
> Great idea Alessandro,
>
> some comments inline below
>
>
>
> On Thu, 27 Jun 2019, 16:09 Alessandro Luccaroni - Diennea <
> alessandro.luccaroni at diennea.com> wrote:
>
> Hi all,
>
> I was thinking about sending an email to the Carnegie Mellon Database Group
> about the entry of Herddb on https://dbdb.io/db/herddb
>
>
>
> I’ve grouped up a bunch of information about Herd, can you check if
> everything seems correct?
>
>
>
> Checkpoints =  "Consistent" (
> https://github.com/diennea/herddb/wiki/Data-storage
> https://github.com/diennea/herddb/wiki/Checkpoints-configuration)
>
> This is not clear to me.
>
> We are always respecting ACID properties. And this is not related to
> checkpoints.
>
>
>
>
>
> Foreign Keys = "Not Supported" (
> https://github.com/diennea/herddb/wiki/SQL-Support)
>
>
>
> Ok
>
> Data Model = "Relational"  (
> https://github.com/diennea/herddb/wiki/SQL-Support)
>
> Ok (relational usually means 'with tables')
>
> Indexes =  "B-Link" "BRIN" (
> https://github.com/diennea/herddb/blob/master/herddb-utils/src/main/java/herddb/index/blink/BLink.java
> https://github.com/diennea/herddb/tree/master/herddb-core/src/main/java/herddb/index/brin
> )
>
> Our own BRIN is not strictly speaking the official BRIN you can find in
> literature, so maybe it is better to write something like BRIN-like
>
>
>
>
>
> Isolation Levels = "Read Committed" (
> https://github.com/diennea/herddb/wiki/SQL-Support)
>
> Ok
>
>
>
> Joins = "Nested Loop join" (
> https://github.com/diennea/herddb/wiki/SQL-Support)
>
>
>
> This is not accurate.
>
> IIRC we support several kinds of join, driven by Apache Calcite.
> Maybe the most common join you will find in a simple plan is the hash join,
> but it really depends on the data and on the planner.
>
> We can write: Joins - various types supported, as driven by the Apache
> Calcite SQL Planner
>
>
>
> Query Interface = "SQL" "Command-line/Shell" (
> https://github.com/diennea/herddb/wiki/SQL-Support)
>
> JDBC, proprietary API and command line
>
> Storage Architecture = "Hybrid"
>
> Not sure what it means
>
> Storage Model = "Key/Value" "N-ary Storage Model (Row/Record)" (
> https://github.com/diennea/herddb/wiki/Data-storage)
>
> I don't know. HerdDB is mostly a key-value store, on top of which we have
> built an sql engine
>
> Storage Organization = "Log-structured" "Heaps" (
> https://github.com/diennea/herddb/wiki/Data-storage)
>
> Something like that
>
> Stored Procedures = "Not Supported" (
> https://github.com/diennea/herddb/wiki/SQL-Support)
>
> Ok
>
> System Architecture =  "Shared-Nothing" (
> https://github.com/diennea/herddb/wiki/Replication)
>
> Ok
>
> Views = "Not Supported" (
> https://github.com/diennea/herddb/wiki/SQL-Support)
>
> Ok
>
>
>
> I’m still in doubt about some other definitions, see below with some
> “options” using the dbdb.io “nomenclature”:
>
>
>
> “Concurrency Control”
>
> 1)      Deterministic Concurrency Control
>
>
>
> Maybe this one
>
>
>
> 2)      Multi-version Concurrency Control (MVCC)
>
> 3)      Optimistic Concurrency Control (OCC)
>
> 4)      Timestamp Ordering
>
> 5)      Two-Phase Locking (Deadlock Detection)
>
> 6)      Two-Phase Locking (Deadlock Prevention)
>
>
>
> “Query Compilation”
>
> 1)      Code Generation
>
> Something like that
>
> 2)      JIT Compilation
>
> 3)      Not Supported
>
> 4)      Stored Procedure Compilation
>
>
>
> “Query Execution”
>
> 1)      Materialized Model
>
> 2)      Tuple-at-a-Time Model
>
>
>
> Maybe this one
>
> 3)      Vectorized Model
>
>
>
> But we could also use some terminology that is not currently covered (for
> example currently there are no DBMS mapped with either BRIN or B-Link
> indexes, but we support them).
>
>
>
> *Alessandro Luccaroni*
> Platform Manager @ Diennea - MagNews
> Tel.: (+39) 0546 066100 Int. 924 - Mob.: (+39) 393 7273519
> Viale G.Marconi 30/14 - 48018 Faenza (RA) - Italy
>
>
>
>
> ------------------------------
>
>
> CONFIDENTIALITY & PRIVACY NOTICE
> This e-mail (including any attachments) is strictly confidential and may
> also contain privileged information. If you are not the intended recipient
> you are not authorised to read, print, save, process or disclose this
> message. If you have received this message by mistake, please inform the
> sender immediately and destroy this e-mail, its attachments and any copies.
> Any use, distribution, reproduction or disclosure by any person other than
> the intended recipient is strictly prohibited and the person responsible
> may incur in penalties.
> The use of this e-mail is only for professional purposes; there is no
> guarantee that the correspondence towards this e-mail will be read only by
> the recipient, because, under certain circumstances, there may be a need to
> access this email by third subjects belonging to the Company.
>
> _______________________________________________
> herddb-dev mailing list
> herddb-dev at lists.herddb.org
> http://lists.herddb.org/mailman/listinfo/herddb-dev
>
>
>