Clojure: partitioning a lazy sequence after the predicate's truth test changes

Consider a sentence stored in a lazy sequence: each word is one entry, but punctuation stays attached to its word:

("It's" "time" "when" "it's" "time!" "What" "did" "you" "say?" "Nothing!")
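
Now this should be "partitioned" into sentences. I wrote a helper function, last-punctuated?, which checks whether the last character of a word is non-alphabetic. (That part works fine.)

Expected result: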

(("It's" "time" "when" "it's" "time!") ("What" "did" "you" "say?") ("Nothing!"))

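Everything should stay lazy. Unfortunately, I cannot use partition-by: that function splits before the result of the given predicate changes, which means the punctuated entry is not treated as the last entry of its subsequence.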
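
To see why partition-by falls short here, the following sketch applies it directly with a punctuation predicate. The helper name ends-with-punctuation? is a hypothetical stand-in for the helper mentioned above, and the set of punctuation characters is an assumption.

```clojure
(def words '("It's" "time" "when" "it's" "time!" "What" "did" "you" "say?" "Nothing!"))

;; Hypothetical stand-in for the helper described above:
;; true when the last character of the word is one of these punctuation marks.
(defn ends-with-punctuation? [word]
  (contains? #{\! \? \.} (last word)))

;; partition-by starts a new group whenever the predicate's value changes,
;; so each punctuated word is cut off from the words that precede it.
(def naive-result (partition-by ends-with-punctuation? words))

(prn naive-result)
;; prints: (("It's" "time" "when" "it's") ("time!") ("What" "did" "you") ("say?" "Nothing!"))
```

Note that "time!" ends up in a group of its own and "say?" is merged with "Nothing!", which is not the desired sentence grouping.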

When the size of the input differs from the size of the output, the usual answer is reduce:

(defn last-word? [word]
  (assert word)
  (or (.endsWith word "!")
      (.endsWith word "?")))

(defn make-sentence [in]
  (reduce (fn [acc ele]
            (let [up-to-current-sentence (vec (butlast acc))
                  last-word-last-sentence (-> acc last last)
                  new-sentence? (when last-word-last-sentence (last-word? last-word-last-sentence))
                  current-sentence (vec (last acc))]
              (if new-sentence?
                (conj acc [ele])
                (conj up-to-current-sentence (conj current-sentence ele)))))
          [] in))

Unfortunately, reduce has to run to completion, so it cannot be used with lazy input. This has been discussed elsewhere.
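
The eagerness of reduce can be made visible by counting how many input elements get realized. This is only a minimal sketch: the counter seen and the helper tracked are names introduced here for illustration, and a plain (reduce conj [] ...) stands in for make-sentence.

```clojure
(def words '("It's" "time" "when" "it's" "time!" "What" "did" "you" "say?" "Nothing!"))

;; Counts how many elements of the input have been realized.
(def seen (atom 0))

;; Wraps coll in a lazy seq that bumps the counter as each element is realized.
;; words is a list, so map walks it one element at a time (no chunking).
(defn tracked [coll]
  (map (fn [w] (swap! seen inc) w) coll))

;; Purely lazy access touches only what is asked for.
(first (tracked words))
(def seen-lazy @seen)

;; reduce walks the entire input before it returns anything.
(reset! seen 0)
(first (reduce conj [] (tracked words)))
(def seen-reduce @seen)

(prn seen-lazy seen-reduce)
;; prints: 1 10
```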

I would suggest using lazy-seq. I can't think of anything better (though this may not be the best option).
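
The definition of parts used in the REPL session that follows is not shown, so here is a minimal sketch of how a lazy-seq version might look. It assumes the same split-with strategy as the iterate version further down and is not necessarily the answerer's exact code.

```clojure
(defn parts [items pred]
  (lazy-seq
    (when (seq items)
      ;; l: the words before the first one that satisfies pred
      ;; r: that word and everything after it
      (let [[l r] (split-with (complement pred) items)]
        ;; One sentence is l plus the closing word; recurse lazily on the rest.
        (cons (concat l (take 1 r))
              (parts (rest r) pred))))))
```

Because the recursive call sits inside lazy-seq, each sentence is only computed when a consumer asks for it, so the function should also work on unbounded input.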

In the REPL:

user> (let [items '("It's" "time" "when" "it's"
                    "time!" "What" "did" "you"
                    "say?" "Nothing!")]
        (parts items (comp #{\? \! \. \,} last)))

(("It's" "time" "when" "it's" "time!") ("What" "did" "you" "say?") ("Nothing!"))

user> (let [items '("what?" "It's" "time" "when" "it's"
                    "time!" "What" "did" "you"
                    "say?" "Nothing!")]
        (parts items (comp #{\? \! \. \,} last)))

(("what?") ("It's" "time" "when" "it's" "time!") ("What" "did" "you" "say?") ("Nothing!"))

user> (let [items '("what?" "It's" "time" "when" "it's"
                    "time!" "What" "did" "you"
                    "say?" "Nothing!")]
        (realized? (parts items (comp #{\? \! \. \,} last))))

false

Update: the same approach expressed with iterate would probably be better:

(defn parts [items pred]
  (->> [nil items]
       (iterate (fn [[_ items]]
                  (let [[l r] (split-with (complement pred) items)]
                    [(concat l (take 1 r)) (rest r)])))
       rest
       (map first)
       (take-while seq)))
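
As a quick check that this version also stays lazy, the sketch below runs it on an infinite input built with cycle and takes only the first three sentences. The definition of parts is repeated so the snippet runs on its own; the input and the names ends-sentence? and first-three are made up for illustration.

```clojure
(defn parts [items pred]
  (->> [nil items]
       (iterate (fn [[_ items]]
                  (let [[l r] (split-with (complement pred) items)]
                    [(concat l (take 1 r)) (rest r)])))
       rest
       (map first)
       (take-while seq)))

;; Same predicate as in the REPL session above.
(def ends-sentence? (comp #{\? \! \. \,} last))

;; cycle never ends, so this only terminates because parts is lazy.
(def first-three
  (take 3 (parts (cycle ["What" "did" "you" "say?" "Nothing!"]) ends-sentence?)))

(prn first-three)
;; prints: (("What" "did" "you" "say?") ("Nothing!") ("What" "did" "you" "say?"))
```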

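This problem can actually be expressed quite easily by generating a new sequence that contains "split markers" and then partitioning it with a different predicate: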

(def punctuation? #{\. \! \?})

(def words ["It's" "time" "when" "it's" "time!" "What" "did" "you" "say?" "Nothing!"])

(defn partition-sentences [ws]
  (->> ws
    (mapcat #(if (punctuation? (last %)) [% :br] [%]))
    (partition-by #(= :br %))
    (take-nth 2)))


(println (take 20 (partition-sentences (repeatedly #(rand-nth words)))))
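
To make the marker idea concrete, the sketch below prints each intermediate stage of the pipeline for a short input. The names sample, marked, grouped, and sentences are introduced here only for illustration.

```clojure
(def punctuation? #{\. \! \?})

(def sample ["Hi" "there!" "Ok?"])

;; Stage 1: emit a :br marker after every word that ends in punctuation.
(def marked (mapcat #(if (punctuation? (last %)) [% :br] [%]) sample))

;; Stage 2: partition-by now alternates between sentences and lone markers.
(def grouped (partition-by #(= :br %) marked))

;; Stage 3: keep every second group, which drops the marker groups.
(def sentences (take-nth 2 grouped))

(prn marked)    ;; prints: ("Hi" "there!" :br "Ok?" :br)
(prn grouped)   ;; prints: (("Hi" "there!") (:br) ("Ok?") (:br))
(prn sentences) ;; prints: (("Hi" "there!") ("Ok?"))
```

The take-nth step relies on the first group always being a sentence, which holds because a marker is only ever emitted after a word.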

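Comments:

reduce is not lazy at all... so I guess this isn't what the OP wants.

I was just thinking about that fact. I'm going to look into whether reduce can be made lazy.

As you probably know, reduce seems to be the "go-to" for this kind of problem. It would be nice if it could be made lazy.

Looks good. I can't dig into it right now, but are you sure it is still lazy in the end? Also, one improvement: there is mapcat, which would make the flatten go away.

I have switched to mapcat as you suggested. The updated example shows that this never forces the whole sequence to be realized.

Thanks for your effort, I like the creativity of this approach. It works well. In the end I decided to use one of the solutions suggested by leetwinski (iterate/lazy-seq), because I found the use of a "stopper" a bit too hacky, but that is my personal taste.

In terms of simplicity, I like the first (lazy-seq) version best. To me it is clear, and it avoids the overhead the iterate version needs, such as initializing with nil and the rest / (map first) steps. Question: why do you think the iterate approach would be better? Does it have something to do with recursion and the stack (similar to loop/recur)?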