問題描述
為什麼 PyMongo 將 uuid.uuid1() 編碼為 BSON::Binary? (Why does PyMongo encode uuid.uuid1() as a BSON::Binary?)
I'm adding a 'GUID' key with a value of uuid.uuid1() (from python uuid module) for all my documents in Mongo. I noticed they are being stored not as strings, but as type BSON::Binary. I've done some Googling already, but I still don't understand what the purpose/advantage to this serialization is. Can someone explain? Should I be converting the uuid.uuid1() to strings before storing? How can I use a string to find() by the GUID value like db.myCol.find({ 'GUID' : aString })?
參考解法
方法 1:
The default serialization for a Python uuid
uses a UUID
binary representation in the BSON spec because this ensures consistent sorting for range queries, and also uses less storage for data/indexes.
For example, these three strings are equivalent in hex:
5d78ad35ea5f11e1a183705681b29c47
5D78AD35EA5F11E1A183705681B29C47
5d78ad35ea5f11e1A183705681B29C47
..but have different sort orders as strings:
> db.uuidsort.find().sort({_id:1})
{ "_id" : "5D78AD35EA5F11E1A183705681B29C47" }
{ "_id" : "5d78ad35ea5f11e1A183705681B29C47" }
{ "_id" : "5d78ad35ea5f11e1a183705681b29c47" }
Comparing the bson sizes:
> db.uuidtest.find()
{ "_id" : BinData(3,"XXitNepfEeGhg3BWgbKcRw==") }
{ "_id" : "5d78ad35ea5f11e1a183705681b29c47" }
> Object.bsonsize(db.uuidtest.findOne({_id: BinData(3,"XXitNepfEeGhg3BWgbKcRw==")}))
31
> Object.bsonsize(db.uuidtest.findOne({_id: "5d78ad35ea5f11e1a183705681b29c47"}))
47
If you do want to insert as strings, you can use UUID.hex to get the 32-character string equivalent:
>>> db.uuidtest.insert({'_id': uuid.hex})
'5d78ad35ea5f11e1a183705681b29c47'
If you want to find UUIDs by string from Python, you can use the uuid.UUID methods:
>>> db.uuidtest.find_one({'_id':uuid.UUID('5d78ad35ea5f11e1a183705681b29c47')})
{u'_id': UUID('5d78ad35-ea5f-11e1-a183-705681b29c47')}
If you want to find UUIDs by string from the mongo
shell, there is a UUID()
helper:
> db.uuidtest.find({_id:UUID('5d78ad35ea5f11e1a183705681b29c47')})
{ "_id" : BinData(3,"XXitNepfEeGhg3BWgbKcRw==") }
Note: there are a few other UUID subtypes available for interoperability with other driver versions, as described in the API docs for bson.binary.