Clojure: Finding a sequence of bytes in a Java byte array

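Can someone tell me how to find a sequence of two bytes in a Java byte array? The two bytes I am looking for are FF DA (hex). It looks like Java maps a byte to the range -128 to 127; I was expecting a range of 0-255.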

Here is one attempt:

(->> (partition 2 byte-array)
     (keep-indexed (fn [i ab]
                     (when (= ab [(byte 0xFF) (byte 0xDA)]))))
     first)
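But this doesn't work because of the negative range. Also, the overhead of these higher-order functions and lazy sequences is probably not warranted here.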

My actual use case is finding the position in the byte array of a JPG image where the image data starts (and the metadata section ends).

Cast the values to bytes. Alternatively, you can convert the hex values to signed bytes (-128 to 127).
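As a minimal sketch of the cast approach, assuming the data really is a Java `byte[]`: `unchecked-byte` from clojure.core wraps values above 127 into the signed range, whereas `(byte 0xFF)` throws because 255 is out of range for a byte. The names `ff-da` and `find-marker` are hypothetical and only used for illustration.

```clojure
;; The two bytes being searched for, stored as signed bytes (-1 and -38).
(def ff-da [(unchecked-byte 0xFF) (unchecked-byte 0xDA)])

(defn find-marker
  "Hypothetical helper: index of the first occurrence of marker
  in the byte array data, or nil when it does not occur."
  [^bytes data marker]
  (->> (partition (count marker) 1 data)   ; sliding windows, step 1
       (keep-indexed (fn [i window]
                       (when (= window marker) i)))
       first))

(find-marker (byte-array (map unchecked-byte [0x00 0x10 0xFF 0xDA 0x01]))
             ff-da)
;=> 2
```

Note the step of 1 in `partition`; with `(partition 2 data)` a match starting at an odd offset would be missed.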


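Updated solution

Here is a better way to solve the problem. Although I don't normally like using nil to mean false, the built-in function keep-indexed nicely combines the keep=true/false decision with the reduction step that discards every value for which keep=false.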

(s/defn find-patterns :- [s/Int]
  [pattern-vec :- tsk/List
   data-vec :- tsk/List]
  (let [parts         (partition (count pattern-vec) 1 data-vec)
        idxs          (keep-indexed
                        (fn [idx candidate]
                          (when (= candidate pattern-vec)
                            idx))
                        parts)]
    idxs))

(s/defn find-pattern :- s/Int
  [pattern  :- tsk/List
   data     :- tsk/List ]
  (first (find-patterns pattern data)))

(deftest t-find-pattern
  ; index:    0 1 2  3    4    5    6    7   8 9
  (let [data [ 0 1 2 0xAA 0xFA 0xFF 0xDA 0xDD 8 9] ]
    (is= 5 (find-pattern  [0xFF 0xDA] data))))
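For readers without the schema and tupelo dependencies, here is a rough plain-Clojure sketch of the same idea. The name `match-indices` is hypothetical. It also shows that `keep-indexed` only drops nil results, so a match at index 0 is kept.

```clojure
(defn match-indices
  "Hypothetical helper: lazy seq of every index at which pattern
  occurs in data (both are ordinary Clojure sequences)."
  [pattern data]
  (keep-indexed (fn [idx window]
                  ;; when returns nil on a mismatch, and keep-indexed
                  ;; discards nil results
                  (when (= window pattern) idx))
                (partition (count pattern) 1 data)))

(match-indices [0xFF 0xDA] [0xFF 0xDA 0 0xFF 0xDA])
;=> (0 3)
```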
Original solution

Here is one way to do it. Using spy-let lets you see the intermediate steps:

(ns tst.clj.core
  (:require [tupelo.core :as t]))
(t/refer-tupelo)

;           0 1 2  3    4    5    6    7   8 9
(def data [ 0 1 2 0xAA 0xFA 0xFF 0xDA 0xDD 8 9] )

(defn find-pattern
  [pattern data]
  (spy-let [
    patt-matches?   (fn [idx tst-pat] [idx (= tst-pat pattern) ] )
    parts           (partition (count pattern) 1 data)
    idx-labelled    (map-indexed patt-matches? parts)
    idx-matches?    (fn [[idx matches]] (= true matches))
    idx-that-match  (filter idx-matches? idx-labelled)
    result          (first (first idx-that-match))
  ]
    result
  )
)
(spyx (find-pattern [0xFF 0xDA] data))
Result:

parts => ((0 1) (1 2) (2 170) (170 250) (250 255) (255 218) (218 221) (221 8) (8 9))

idx-labelled => ([0 false] [1 false] [2 false] [3 false] [4 false] [5 true] [6 false] [7 false] [8 false])

idx-that-match => ([5 true])

result => 5

(find-pattern [255 218] data) => 5

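I don't know what format your data is in, but if you just put it into a Clojure vector you no longer have to worry about signed vs. unsigned bytes.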
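If the data does start out as a `byte[]`, one possible way to get such a vector is to mask each byte with 0xFF. This is only a sketch, and `bytes->unsigned-vec` is a hypothetical name:

```clojure
(defn bytes->unsigned-vec
  "Hypothetical helper: copies a byte array into a vector of
  integers in the range 0-255."
  [^bytes ba]
  ;; bit-and with 0xFF drops the sign extension: -1 becomes 255
  (mapv #(bit-and % 0xFF) ba))

(bytes->unsigned-vec
  (byte-array [(unchecked-byte 0xFF) (unchecked-byte 0xDA) (byte 8)]))
;=> [255 218 8]
```

The resulting vector can be searched with patterns written as plain hex literals such as [0xFF 0xDA], at the cost of copying the whole array.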

I would use something like this to convert numbers to bytes:

user> 
(defn as-byte [^long n]
  {:pre [(<= 0 n 255)]}
  ;; unchecked-byte wraps 128-255 to -128..-1 and leaves 0-127 unchanged
  ;; (subtracting 256 unconditionally would turn 0x10 into -240)
  (unchecked-byte n))
#'user/as-byte

user> (as-byte 0xff)
;;=> -1

user> (as-byte 0xda)
;;=> -38

user> (as-byte 0xfff)
AssertionError Assert failed: (<= 0 n 255)  user/as-byte (form-init2939145917481178115.clj:259)
Answer:

(import 'java.util.Arrays)

;; Returns the index of the first occurrence of look-for in data, or -1.
(defn find-bytes-idx [^bytes look-for ^bytes data]
  (let [search-len (alength look-for)
        diff (- (alength data) search-len)]   ; last possible start index
    (if (neg? diff)
      -1
      (loop [i 0]
        (cond (> i diff) -1
              (Arrays/equals look-for (Arrays/copyOfRange data i (+ i search-len))) i
              :else (recur (inc i)))))))

user> (find-bytes-idx (byte-array (map as-byte [0xcd 0xef]))
                      (byte-array (map as-byte [0xaa 0xab 0xcd 0xcd 0xef 0xdf])))
;=> 3

user> (find-bytes-idx (byte-array (map as-byte [0xcd 0xef]))
                      (byte-array (map as-byte [0xaa 0xab 0xcd 0xcd 0xee])))
;=> -1
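Since the question raises overhead, here is a rough sketch of a variant that compares bytes in place with aget instead of copying a sub-array for every candidate position. The names `matches-at?` and `index-of-bytes` are hypothetical, and I have not benchmarked this against the version above.

```clojure
(defn- matches-at?
  "True when needle occurs in haystack starting at offset.
  The caller must ensure offset + (alength needle) is in bounds."
  [^bytes needle ^bytes haystack ^long offset]
  (let [n (alength needle)]
    (loop [j 0]
      (cond
        (= j n) true                                   ; all bytes matched
        (= (aget needle j) (aget haystack (+ offset j))) (recur (inc j))
        :else false))))

(defn index-of-bytes
  "Index of the first occurrence of needle in haystack, or -1."
  [^bytes needle ^bytes haystack]
  (let [last-start (- (alength haystack) (alength needle))]
    (loop [i 0]
      (cond
        (> i last-start) -1
        (matches-at? needle haystack i) i
        :else (recur (inc i))))))

(index-of-bytes (byte-array (map unchecked-byte [0xcd 0xef]))
                (byte-array (map unchecked-byte [0xaa 0xab 0xcd 0xcd 0xef 0xdf])))
;=> 3
```

Both arrays hold signed bytes, so the comparison works without any conversion to the 0-255 range.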