JSON: How to filter nested lists and maps in Scala

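I am trying to read a JSON file in order to compute some metrics in Scala. I managed to read the file and do some top-level filtering, but I am struggling to understand how to filter nested lists and maps.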

Here is a sample (the real JSON is much longer):

I managed to get a list of the 3 titles:

import scala.util.parsing.json._
val parsedData = JSON.parseFull(data)
val listTitles = parsedData.get.asInstanceOf[List[Map[String, Any]]].map( { case e: Map[String, Any] => e("title").toString }  )
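
Here are my three questions:

  • Is this a good way to get the list of the 3 titles?
  • How can I get a list containing the number of paid users for each of the 3 titles?
  • How can I get a list containing the number of users who have completed the course for each of the 3 titles?

Thanks in advance for your help.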

    You can use the play-json library to parse the JSON and retrieve the fields you need. For example:

    import play.api.libs.json.Json
    
    val rawData1 = Json.parse("""[{"technology":"C","users":[{"rating":5,"completed":false,"user":{"id":11111,"paid":true}},{"rating":4,"completed":false,"user":{"id":22222,"paid":false}}],"title":"CS50"},{"technology":"C++","users":[{"rating":3,"completed":true,"user":{"id":33333,"paid":false}},{"rating":5,"completed":true,"user":{"id":44444,"paid":false}}],"title":"Introduction to C++"},{"technology":"Haskell","users":[{"rating":5,"completed":false,"user":{"id":55555,"paid":false}},{"rating":null,"completed":true,"user":{"id":66666,"paid":false}}],"title":"Course on Haskell"}]""")
    
    val resultedList = (rawData1 \\ "title").toList.map(_.as[String])
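
    The `\\` operator used above looks a key up at any depth of the JSON tree. To illustrate that idea without any library, here is a minimal sketch on plain Scala collections. `findAll` and the abbreviated `courses` sample are hypothetical names introduced only for this illustration; this is not how play-json implements the lookup.

```scala
// Hypothetical helper (not part of play-json): collect every value stored
// under `key` at any depth of a tree built from Map[String, Any] and List[Any].
def findAll(key: String, node: Any): List[Any] = node match {
  case m: Map[_, _] =>
    m.toList.flatMap {
      case (k, v) if k == key => v :: findAll(key, v) // keep the hit, keep descending
      case (_, v)             => findAll(key, v)      // no hit here, look deeper
    }
  case xs: List[_] => xs.flatMap(findAll(key, _))
  case _           => Nil // strings, numbers, booleans, null: nothing to search
}

// Abbreviated, hand-built stand-in for the parsed sample data.
val courses: List[Map[String, Any]] = List(
  Map("technology" -> "C", "title" -> "CS50",
    "users" -> List(Map("rating" -> 5, "completed" -> false,
      "user" -> Map("id" -> 11111, "paid" -> true)))),
  Map("technology" -> "C++", "title" -> "Introduction to C++",
    "users" -> List(Map("rating" -> 3, "completed" -> true,
      "user" -> Map("id" -> 33333, "paid" -> false)))),
  Map("technology" -> "Haskell", "title" -> "Course on Haskell",
    "users" -> Nil)
)

// Keep only the string hits, mirroring `.map(_.as[String])` above.
val titles: List[String] = findAll("title", courses).collect { case s: String => s }
// titles == List("CS50", "Introduction to C++", "Course on Haskell")
```

    The same helper also finds deeper keys, for example `findAll("paid", courses)`, which is why a recursive lookup is convenient for quick extraction but less precise than a typed model.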
    

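    As the other answer suggests, you should use the play-json library. It is really powerful and has many features, including object mapping, parsing, and error handling.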

      import play.api.libs.json._
    
      // id is a number in the JSON, so it is modelled as Int rather than String
      case class User(id: Int, paid: Boolean)
      object User {
        implicit val format: OFormat[User] = Json.format[User]
      }
    
      // rating can be null in the JSON, hence Option[Int]
      case class UserCourseStat(rating: Option[Int], completed: Boolean, user: User)
      object UserCourseStat {
        implicit val format: OFormat[UserCourseStat] = Json.format[UserCourseStat]
      }
    
      case class Data(technology: String, title: String, users: List[UserCourseStat])
      object Data {
        implicit val format: OFormat[Data] = Json.format[Data]
      }
    
      val jsString = """[{"technology":"C","users":[{"rating":5,"completed":false,"user":{"id":11111,"paid":true}},{"rating":4,"completed":false,"user":{"id":22222,"paid":false}}],"title":"CS50"},{"technology":"C++","users":[{"rating":3,"completed":true,"user":{"id":33333,"paid":false}},{"rating":5,"completed":true,"user":{"id":44444,"paid":false}}],"title":"Introduction to C++"},{"technology":"Haskell","users":[{"rating":5,"completed":false,"user":{"id":55555,"paid":false}},{"rating":null,"completed":true,"user":{"id":66666,"paid":false}}],"title":"Course on Haskell"}]"""
    
      val rowData: JsValue = Json.parse(jsString)
    
      rowData.validate[List[Data]] match {
        case JsSuccess(dataList, _) =>
          val chosenTitles = List("Course on Haskell", "Introduction to C++", "CS50")
    
          // map of each chosen title to the set of its users
          val chosenTitleToUsersMap = chosenTitles.map { title =>
            title -> dataList.filter(_.title == title)
              .flatMap(_.users.map(_.user))
              .toSet
          }.toMap
          // map of each chosen title to the set of its paid users
          val chosenTitleToPaidUsersMap = chosenTitleToUsersMap.map { case (title, users) =>
            title -> users.filter(_.paid)
          }
    
          // users who have completed every one of the chosen titles
          val allUsers = dataList.flatMap(_.users.map(_.user)).toSet
    
          val usersWhoCompletedAllChosenTitles = allUsers.filter { user =>
            chosenTitles.forall { title =>
              // the user must have a completed entry in the course with this title
              dataList.exists(d => d.title == title && d.users.exists(s => s.user == user && s.completed))
            }
          }
    
        case JsError(errors) =>
          // errors lists each failing JSON path with its validation errors;
          // handle the error case here
          ???
      }
    
    Regarding your 3 questions:

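  • Is this a good way to get the list of the 3 titles?
    I can see two unsafe operations. One is asInstanceOf, which throws if the parsed value does not have the expected shape. The other is e("title"): it does not use Map's .get(key) method, so it throws an exception if the key is not found.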

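  • How can I get a list containing the number of paid users for each of the 3 titles?
    This is computed above in the val named chosenTitleToPaidUsersMap; the size of each set is the number of paid users for that title.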

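  • How can I get a list containing the number of users who have completed the course for each of the 3 titles?
    This is computed above in the val named usersWhoCompletedAllChosenTitles.

    I suggest you use the json4s library. It lets you extract the data into case classes: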

    import org.json4s.jackson.JsonMethods.parseOpt
    import org.json4s.DefaultFormats
    implicit val formats = DefaultFormats
    
    case class Tech(technology: String, users: Seq[TechUser], title: String)
    case class TechUser(rating: Option[Int], completed: Boolean, user: UserInfo)
    case class UserInfo(id: Int, paid: Boolean)
    
    val rawData = """..."""
    val Some(json) = parseOpt(rawData)
    val Some(data) = json.extractOpt[List[Tech]]
    
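    Once this is done, data is a regular Scala data structure that you can manipulate as needed. For example, to find the title of the first course that has a user whose id is divisible by 5, you can write: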

    data.find(_.users.exists(_.user.id % 5 == 0)).map(_.title)
    // Result: Some("Course on Haskell")
    

    The answers to the three questions are one-liners just like this one, but I leave them to you as an exercise.
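
    For readers who want to check their solution to that exercise, here is one possible sketch. It assumes the case classes from this answer and replaces the json4s extraction with a hand-built `data` value that mirrors the sample JSON, so it runs without a JSON library. The names `titles`, `paidPerTitle` and `completedPerTitle` are introduced here for illustration.

```scala
// Same model as in the answer, redeclared so the sketch is self-contained.
case class UserInfo(id: Int, paid: Boolean)
case class TechUser(rating: Option[Int], completed: Boolean, user: UserInfo)
case class Tech(technology: String, users: Seq[TechUser], title: String)

// Hand-built equivalent of the extracted sample data (assumption: this is
// what json.extractOpt[List[Tech]] would yield for the sample JSON).
val data: List[Tech] = List(
  Tech("C", Seq(
    TechUser(Some(5), completed = false, UserInfo(11111, paid = true)),
    TechUser(Some(4), completed = false, UserInfo(22222, paid = false))), "CS50"),
  Tech("C++", Seq(
    TechUser(Some(3), completed = true, UserInfo(33333, paid = false)),
    TechUser(Some(5), completed = true, UserInfo(44444, paid = false))), "Introduction to C++"),
  Tech("Haskell", Seq(
    TechUser(Some(5), completed = false, UserInfo(55555, paid = false)),
    TechUser(None, completed = true, UserInfo(66666, paid = false))), "Course on Haskell")
)

// 1. The list of titles.
val titles: List[String] = data.map(_.title)

// 2. Number of paid users per title, in the same order as `titles`.
val paidPerTitle: List[Int] = data.map(_.users.count(_.user.paid))
// paidPerTitle == List(1, 0, 0)

// 3. Number of users who completed the course, per title.
val completedPerTitle: List[Int] = data.map(_.users.count(_.completed))
// completedPerTitle == List(0, 2, 1)
```

    If the counts should stay attached to their titles, `titles.zip(paidPerTitle)` pairs them up.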

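    You should no longer use Scala's built-in JSON parser. There are better native alternatives, e.g. […]. If you still want to take the hard road, see here: […]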
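
    As a rough idea of what the hard road involves, the sketch below works directly on an untyped tree of `Any` values and replaces the unsafe cast and `e("title")` lookup from the question with pattern matching and `Map.get`. It is only an illustration: `parsed` is built by hand as a stand-in for the `Option[Any]` returned by `JSON.parseFull`, and `users`, `isPaid` and `paidPerTitle` are hypothetical names.

```scala
// Hand-built stand-in for the result of JSON.parseFull (assumption), with one
// malformed element to show that bad input is skipped instead of throwing.
val parsed: Option[Any] = Some(List[Any](
  Map[String, Any]("title" -> "CS50", "users" -> List(
    Map[String, Any]("completed" -> false, "user" -> Map[String, Any]("paid" -> true)),
    Map[String, Any]("completed" -> false, "user" -> Map[String, Any]("paid" -> false)))),
  Map[String, Any]("title" -> "Introduction to C++", "users" -> List(
    Map[String, Any]("completed" -> true, "user" -> Map[String, Any]("paid" -> false)))),
  "not a course"
))

// Keep only the elements that are maps; JSON object keys are always strings,
// so the cast of the erased type parameters is safe here.
def asMaps(xs: List[_]): List[Map[String, Any]] =
  xs.collect { case m: Map[_, _] => m.asInstanceOf[Map[String, Any]] }

val courses: List[Map[String, Any]] = parsed match {
  case Some(xs: List[_]) => asMaps(xs)
  case _                 => Nil // parse failure or unexpected top-level shape
}

// Map.get returns an Option, so a missing or non-string title is dropped.
val titles: List[String] =
  courses.flatMap(_.get("title").collect { case s: String => s })

// The nested user entries of one course, or Nil if the field is absent.
def users(course: Map[String, Any]): List[Map[String, Any]] =
  course.get("users") match {
    case Some(us: List[_]) => asMaps(us)
    case _                 => Nil
  }

// True only when the entry has a nested "user" map whose "paid" field is true.
def isPaid(entry: Map[String, Any]): Boolean = entry.get("user") match {
  case Some(u: Map[_, _]) => u.asInstanceOf[Map[String, Any]].get("paid").contains(true)
  case _                  => false
}

// Each title paired with its number of paid users.
val paidPerTitle: List[(String, Int)] = for {
  course <- courses
  title  <- course.get("title").collect { case s: String => s }
} yield title -> users(course).count(isPaid)
// paidPerTitle == List(("CS50", 1), ("Introduction to C++", 0))
```

    Every access needs its own shape check, which is the main reason the other answers recommend extracting into case classes instead.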