Problem description
What is the most efficient way to store name/value pairs in a MarkLogic database?
My application often needs to decorate values in the documents it serves, using a lookup table to fetch human-readable forms of various codes.
For example, <product_code>PC001</product_code> would need to be returned as <product_code code='PC001'>Widgets</product_code>. It's not always product_code; there are a few different types of code that need similar behaviour (some with just a few dozen values, some with a few thousand).
What I want to know is what is the most efficient way to store that data in the database? I can think of two possibilities:
1) One document per code type, with many elements:
<product-codes>
<product-code code="PC001">Widgets</product-code>
<product-code code="PC002">Wodgets</product-code>
<product-code code="PC003">Wudgets</product-code>
</product-codes>
2) One document per code, each containing a <product-code> element as above.
(Obviously, both options would include sensible indexes)
Is either of these noticeably faster than the other? Is there another, better option?
My feeling is that it's generally better to keep one 'thing' per document since it's conceptually slightly cleaner and (I understand) better suited to ML's indexing, but in this case that seems like it would lead to a very large number of very small files. Is that something I should worry about?
Reference solutions
Method 1:
Anything that needs to be searched independently should be its own document or fragment. However, if you are just doing lookups then an element attribute range index should be very fast at returning values:
cts:element-attribute-range-query(xs:QName('product-code'), xs:QName('code'), '=', 'PC001')
=>
Widgets
Using a range index, the lookups will all occur from the same index regardless of how you chunk the documents. So unless you need to use cts:search on product-code to retrieve the actual elements, it shouldn't matter how you chunk the documents.
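For the case where the element itself does have to be fetched, the lookup could be sketched as below. This is only an illustration, not the answerer's code: it assumes a string range index has been configured on the code attribute of product-code, and the helper name local:label is hypothetical. Because it goes through cts:search, it is the variant where document chunking can matter.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: returns the human-readable label for a code.
   Assumes a string range index exists on product-code/@code. :)
declare function local:label($code as xs:string) as xs:string?
{
  (: Search for product-code elements whose code attribute equals $code,
     and keep only the first match. :)
  let $hit := cts:search(
    //product-code,
    cts:element-attribute-range-query(
      xs:QName("product-code"), xs:QName("code"), "=", $code)
  )[1]
  (: Yields the element's text, or the empty sequence if nothing matched. :)
  return $hit/fn:string(.)
};

local:label("PC001")
```

With the sample data from the question loaded, the final expression would be expected to yield the text of the element whose code is PC001.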
Method 2:
Another approach is to store a map that represents the name‑value pairs.
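(: Build a native map and add one name/value pair. :)
let $m := map:map()
let $_ := map:put($m, 'a', 'fubar')
(: Wrapping the map in a document node yields its XML representation. :)
return document { $m }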
This returns an XML representation of the hashmap, which can be stored directly in the database using xdmp:document-insert. You can turn an XML map back into a native map using map:map as a constructor function. The native map could also be memoized using xdmp:set-server-field.
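The read side of this approach might look like the sketch below. It assumes the XML form of the map was stored earlier, in a separate transaction, with xdmp:document-insert; the URI and the server-field name are hypothetical values chosen for illustration.

```xquery
xquery version "1.0-ml";

(: Hypothetical URI and server-field name, chosen for illustration. :)
declare variable $URI := "/lookups/product-codes.xml";
declare variable $FIELD := "product-codes-map";

(: Return the cached native map, rebuilding it from the stored XML
   the first time it is requested on this host. Assumes the document
   at $URI already exists and has a map:map root element. :)
declare function local:codes() as map:map
{
  let $cached := xdmp:get-server-field($FIELD)
  return
    if (fn:exists($cached)) then $cached
    else
      (: xdmp:set-server-field returns the value it stores. :)
      xdmp:set-server-field($FIELD, map:map(fn:doc($URI)/map:map))
};

(: Look up the human-readable form of one code. :)
map:get(local:codes(), "PC001")
```

Server fields are held in memory per host, so a cached map would need to be reset whenever the stored document changes; how to trigger that is left open here.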
(by Will Goring, wst, mblakele)