PostgreSQL full-text search on complex jsonb content


I have a very complex jsonb column containing nested arrays and objects, and I need to run a full-text search over it. Example JSON:

{
"buyer": {
    "email": "1010001419test@ekbseo.ru",
    "person": {
        "phone": "1010001419",
        "taxId": "590202081324",
        "address": "г Москва, ул Авиаторов, д 34 ",
        "lastName": "Зайцева",
        "passport": {
            "issuer": "йцукйцук",
            "deptCode": "123241",
            "issueDate": [
                1111,
                11,
                11
            ],
            "numAndSeries": "0001212810"
        },
        "birthDate": [
            1952,
            2,
            18
        ],
        "firstName": "Зоя",
        "birthPlace": "фывфыв",
        "patronymic": "Антоновна",
        "citizenship": "Россия"
    }
},
"dealNo": "05-0000004",
"created": [
    2017,
    3,
    6
],
"services": [
    "SGR"
],
"transactId": "602032128",
"dealDetails": {
    "secondary": {
        "deposit": 200000,
        "sellers": [
            {
                "bank": {
                    "bic": "044525225",
                    "city": "Москва",
                    "name": "ПУБЛИЧНОЕ АКЦИОНЕРНОЕ ОБЩЕСТВО \"СБЕРБАНК РОССИИ\"",
                    "correspondentAccount": "30101810400000000225"
                },
                "email": "dsfs@sdf.ru",
                "amount": 4800000,
                "person": {
                    "phone": "1234132512",
                    "taxId": "590202081324",
                    "address": "г Москва, ул Марьинский Парк, д 45 стр 1 ",
                    "lastName": "Трутненко",
                    "passport": {
                        "issuer": "",
                        "deptCode": "",
                        "issueDate": [
                            -999999999,
                            1,
                            1
                        ],
                        "numAndSeries": ""
                    },
                    "birthDate": [
                        1111,
                        11,
                        11
                    ],
                    "firstName": "ываыаы",
                    "birthPlace": "фывфыв",
                    "patronymic": null,
                    "citizenship": "Россия"
                },
                "account": "48213412341234234234"
            }
        ],
        "propertyAddress": "г Москва, ул Вавилова, д 19 "
    }
},
"bankContacts": {
    "bankOfficeId": 3561,
    "mortgageManager": {
        "casId": 88928,
        "email": "sbtestmik1@yandex.ru",
        "phone": "79853622342",
        "lastName": "Дзержински",
        "firstName": "Макар",
        "patronymic": "Олегович"
    },
    "mortgageDeptHead": {
        "casId": 88923,
        "email": "sbtestrcik@yandex.ru",
        "phone": "72384798798",
        "lastName": "Михрюткин",
        "firstName": "Валентин",
        "patronymic": "Геннадьевич"
    }
},
"contractInfo": {
    "city": "Москва",
    "price": 5000000,
    "cadastralNum": "65:65:76876:876",
    "contractDate": [
        1111,
        11,
        11
    ]
},
"creditContract": {
    "number": "41221312",
    "ownCapital": 1000000,
    "loanCapital": 4000000
}
}

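Specifically, I need to search in:
dealNo
buyer.person.phone
buyer.person.address.** (all text values here)
dealDetails.secondary.sellers[] (all text values here)
bankContacts (all text values here)
What is the best way to do this?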


I'm using PostgreSQL 9.6.

The best approach I have found so far is:

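  • Create a function that takes our JSON and returns a tsvector, like this:

    CREATE OR REPLACE FUNCTION deal_tsvector(deal_no text, data jsonb)
      RETURNS tsvector AS $$
    BEGIN
      -- Concatenate the deal number with the whole document cast to text
      RETURN to_tsvector('russian', deal_no || ' ' || data::text);
    END
    $$ LANGUAGE plpgsql IMMUTABLE;  -- index expressions require an IMMUTABLE function

  • Create an index on it, like this:

    CREATE INDEX IF NOT EXISTS idx_deal_fts
      ON deal USING gin (deal_tsvector(deal_no, request_json));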
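
The steps above only build the index; to benefit from it, a query has to repeat the indexed expression. A minimal sketch, assuming the `deal` table, the `deal_no` and `request_json` columns, and the `deal_tsvector` function named in those steps. The search term is simply a value taken from the sample JSON:

```sql
-- The WHERE clause repeats the indexed expression verbatim;
-- otherwise the planner cannot match it to idx_deal_fts.
SELECT deal_no
FROM deal
WHERE deal_tsvector(deal_no, request_json)
      @@ plainto_tsquery('russian', 'Зайцева');
```

Two things to keep in mind with this approach: because the whole document is cast to text, key names such as lastName end up in the tsvector alongside the values, and a NULL deal_no makes the whole concatenation NULL, so wrapping it in coalesce(deal_no, '') may be worthwhile if the column is nullable.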
    
This is what my similar task looks like.

The DB table looks like this:

     CREATE TABLE sites (
       id text NOT NULL,
       doc jsonb,
       PRIMARY KEY (id)
     )
    
The data we store in the doc column is complex nested JSONB:

       {
          "_id": "123",
          "type": "Site",
          "identification": "Custom ID",
          "title": "SITE 1",
          "address": "UK, London, Mr Tom's street, 2",
          "buildings": [
              {
                   "uuid": "12312",
                   "identification": "Custom ID",
                   "name": "BUILDING 1",
                   "deposits": [
                       {
                          "uuid": "12312",
                          "identification": "Custom ID",             
                          "audits": [
                              {
                                 "uuid": "12312",         
                                  "sample_id": "SAMPLE ID"                
                              }
                           ]
                       }
                   ]
              } 
           ]
        }
    
So the structure of my JSONB looks like this:

    SITE 
      -> ARRAY OF BUILDINGS
         -> ARRAY OF DEPOSITS
           -> ARRAY OF AUDITS
    
We need to implement full-text search over certain values in each type of entry:

    SITE (identification, title, address)
    BUILDING (identification, name)
    DEPOSIT (identification)
    AUDIT (sample_id)
    

The SQL query should run the full-text search only over these field values.
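
One way this could be approached on 9.6 is a function that pulls out only the listed fields and feeds them to to_tsvector. This is a rough sketch, not a tested solution from the thread: the function name `site_tsvector`, the index name `idx_sites_fts`, and the 'english' text search configuration are all assumptions made for illustration.

```sql
-- Hypothetical helper: collect only the searchable fields of a site document.
CREATE OR REPLACE FUNCTION site_tsvector(doc jsonb)
  RETURNS tsvector AS $$
  SELECT to_tsvector('english',
    -- SITE-level fields
    coalesce(doc->>'identification', '') || ' ' ||
    coalesce(doc->>'title', '')          || ' ' ||
    coalesce(doc->>'address', '')        || ' ' ||
    -- BUILDING, DEPOSIT and AUDIT fields from the nested arrays.
    -- Building values repeat once per deposit/audit row, which is
    -- harmless because to_tsvector deduplicates lexemes.
    coalesce((
      SELECT string_agg(
               coalesce(b->>'identification', '') || ' ' ||
               coalesce(b->>'name', '')           || ' ' ||
               coalesce(d->>'identification', '') || ' ' ||
               coalesce(a->>'sample_id', ''), ' ')
      FROM jsonb_array_elements(doc->'buildings') AS b
      LEFT JOIN LATERAL jsonb_array_elements(b->'deposits') AS d ON true
      LEFT JOIN LATERAL jsonb_array_elements(d->'audits')   AS a ON true
    ), ''));
$$ LANGUAGE sql IMMUTABLE;

-- Expression index over the extracted fields only
CREATE INDEX IF NOT EXISTS idx_sites_fts
  ON sites USING gin (site_tsvector(doc));

-- The query repeats the indexed expression
SELECT id
FROM sites
WHERE site_tsvector(doc) @@ plainto_tsquery('english', 'building');
```

Unlike casting the whole document to text, this keeps key names and unrelated fields such as uuid out of the index, at the cost of having to update the function whenever the list of searchable fields changes.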

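@asm0dey, did it work?

@rusllonrails, please look at my own answer.

I have a similar task but not much experience with PG. Could you also share an example of how you add a custom PG function and index for a nested JSONB structure? For example, would this approach work if you need full-text search by the dealDetails.secondary.sellers.bank.bic value?

As you can see, we just cast the data to text, so the tsvector covers the whole JSON content. Of course it does not work on individual fields, but nothing stops you from creating custom indexes for individual fields of the JSON.

Hey man, thanks. Quick question: is the app where you use this private or public? I'm asking because, if possible, I'd like to see the full implementation of the task. Anyway, thanks a lot.

@rusllonrails It's private, sorry.

OK @asm0dey, np. Maybe you could share one complete example for a nested field you did, e.g. dealDetails.secondary.sellers.anyfield: how you indexed it and then how you queried it.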
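
For the nested field asked about in the last comment, one possible shape on 9.6 is a second expression index over just that subtree. This is a sketch and not the answerer's actual code: the function name `deal_sellers_tsvector` and the index name `idx_deal_sellers_fts` are made up here, and the `deal` table with its `deal_no` and `request_json` columns is assumed from the answer.

```sql
-- Hypothetical helper: build a tsvector from the sellers array only.
-- The subtree is cast to text, so key names are indexed along with values.
CREATE OR REPLACE FUNCTION deal_sellers_tsvector(data jsonb)
  RETURNS tsvector AS $$
  SELECT to_tsvector(
    'russian',
    coalesce((data #> '{dealDetails,secondary,sellers}')::text, '')
  );
$$ LANGUAGE sql IMMUTABLE;

CREATE INDEX IF NOT EXISTS idx_deal_sellers_fts
  ON deal USING gin (deal_sellers_tsvector(request_json));

-- Query with the same expression so the index is a candidate,
-- here using the bic value from the sample document.
SELECT deal_no
FROM deal
WHERE deal_sellers_tsvector(request_json)
      @@ plainto_tsquery('russian', '044525225');
```

The same pattern works for any other path: change the text[] path passed to #> and create one index per subtree that needs to be searched separately.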