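Marshmallow: a slightly sweet Python library
Preface
In many scenarios we need to serialize and deserialize Python objects, for example when building a REST API or when loading and saving object-oriented data.
If you write (or copy and paste) a lot of Python, the approach you have met most often is probably the json or pickle module, which is the common way to serialize and deserialize.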
import json
data = {'name': 'John', 'age': 30, 'city': 'New York'}
serialized_data = json.dumps(data)
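In practice, serializing and deserializing Python objects usually goes hand in hand with processing and validating the data.
The subject of today's introduction, Marshmallow, offers more powerful serialization and deserialization together with more elegant parameter validation and data processing.
GitHub: github.com/marshmallow…
It converts complex data types (such as objects or database records) to native Python data types (such as dictionaries), and back again.
It is widely used with web frameworks such as Flask and Django to handle data input and output.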
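Prologue
To serialize or deserialize anything, we first need an object to work with. Let's define a class:
class Novel:
    def __init__(self, title, author, genre, pages):
        self.title = title
        self.author = author
        self.genre = genre
        self.pages = pages
# Instantiate a novel object
interesting_novel = Novel(title="The Enchanting Adventure", author="Jane Doe", genre="Fantasy", pages=300)
The task now is to turn this novel object into a dictionary. How would you do it?
The clumsy ways
- Build the dictionary by hand, mapping each attribute of the novel object to a key and value.
novel_dict = {
    "title": interesting_novel.title,
    "author": interesting_novel.author,
    "genre": interesting_novel.genre,
    "pages": interesting_novel.pages
}
- Use the vars function:
The built-in vars function returns an object's __dict__ attribute, which holds the instance's attributes and their values, so you can convert the object to a dictionary directly:
novel_dict = vars(interesting_novel)
- Use the __dict__ attribute:
For objects that have a __dict__ attribute, you can read it directly to get the attributes and values.
novel_dict = interesting_novel.__dict__
- Use json.loads(json.dumps(obj)):
With the json module, first convert the object to a JSON string, then parse that string back into a dictionary.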
import json
novel_json = json.dumps(interesting_novel, default=lambda o: o.__dict__)
novel_dict = json.loads(novel_json)
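One detail of the vars / __dict__ approaches is easy to miss: both return the instance's live attribute dictionary rather than a copy. The sketch below uses a shortened, hypothetical two-field Novel class to show the difference between that live view and a detached snapshot.

```python
class Novel:
    def __init__(self, title, pages):
        self.title = title
        self.pages = pages


novel = Novel("The Enchanting Adventure", 300)

live_view = vars(novel)       # the very same dict object as novel.__dict__
snapshot = dict(vars(novel))  # shallow copy, detached from the instance

live_view["pages"] = 999      # writing to the live view changes the object too

print(novel.pages)            # 999
print(snapshot["pages"])      # 300
```

If the dictionary is going to be modified or handed to other code, copying it first is usually the safer choice.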
Data classes: the built-in helpers of dataclass / attrs
dataclass version
from dataclasses import dataclass, asdict
from loguru import logger
@dataclass
class Novel:
    title: str
    author: str
    genre: str
    pages: int
# Instantiate a novel object
interesting_novel = Novel(title="The Enchanting Adventure", author="Jane Doe", genre="Fantasy", pages=300)
# Serialize the object to a dictionary
novel_dict = asdict(interesting_novel)
logger.info(novel_dict)
# Deserialize the dictionary back into an object
new_novel = Novel(**novel_dict)
logger.info(new_novel)
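A short sketch of what the dataclass route does and does not give you. asdict converts nested dataclasses recursively, but type hints are not enforced at runtime, so invalid input is accepted silently. The Author class here is a hypothetical addition for illustration only.

```python
from dataclasses import dataclass, asdict


@dataclass
class Author:
    name: str


@dataclass
class Novel:
    title: str
    author: Author
    pages: int


novel = Novel(title="The Enchanting Adventure", author=Author("Jane Doe"), pages=300)
as_dict = asdict(novel)  # nested dataclasses become nested dicts
print(as_dict)

# Annotations are not checked at runtime: this "novel" is created without any error.
bad = Novel(title="x", author="not an Author", pages="three hundred")
print(bad.pages)
```

This missing validation step is one of the gaps a schema library is meant to fill.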
attrs version
import attr
import cattr
from loguru import logger
@attr.define
class Novel:
    title = attr.ib()
    author = attr.ib()
    genre = attr.ib()
    pages = attr.ib()
# Instantiate a novel object
interesting_novel = Novel(title="The Enchanting Adventure", author="Jane Doe", genre="Fantasy", pages=300)
# Serialize the object to a dictionary
novel_dict = cattr.unstructure(interesting_novel)
logger.info(novel_dict)
# Deserialize a dictionary into an object
new_novel_dict = {'title': 'AI之旅', 'author': 'HaiGe', 'genre': 'Fantasy', 'pages': 668}
new_novel = cattr.structure(new_novel_dict, Novel)
logger.info(new_novel)
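A more elegant option: marshmallow
The serialization and deserialization methods above already look quite impressive, so why choose marshmallow?
Although those methods cover basic serialization and deserialization, they are not flexible or convenient enough when it comes to more complex data structures, data validation, pre-processing logic, and integration with other libraries. marshmallow was created to address exactly these problems.
marshmallow provides a powerful and flexible schema definition that gives precise control over serialization and deserialization. It supports complex data structures, custom validators, pre-processing logic, and other advanced features, and it integrates with many common Python libraries and frameworks.
Whether you are building a RESTful API, persisting or migrating data, or doing simple data processing and validation, marshmallow does the job well. It is especially suited to scenarios that involve complex data structures and data exchange.
marshmallow also has extensive documentation and community support, which makes it easier to learn and use. So even with the alternatives above, marshmallow remains a sensible choice, particularly for complex data structures and scenarios.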
Basic usage of marshmallow
Installation
To use marshmallow, install it first:
# version used in this article: 3.20.2
pip3 install marshmallow
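A. Serialization and deserialization
marshmallow provides flexible and powerful serialization and deserialization. It converts complex Python objects to native Python data types, which can then be rendered to formats such as JSON, and it parses external data back into Python objects.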
from marshmallow import Schema, fields, post_load
class Novel:
    def __init__(self, title, author, genre, pages):
        self.title = title
        self.author = author
        self.genre = genre
        self.pages = pages
    def __repr__(self):
        return f"Novel(title={self.title!r}, author={self.author!r}, genre={self.genre!r}, pages={self.pages!r})"
class NovelSchema(Schema):
    title = fields.Str()
    author = fields.Str()
    genre = fields.Str()
    pages = fields.Int()
    @post_load
    def make(self, data, **kwargs):
        return Novel(**data)
# Create a Novel object
novel = Novel(title="The Enchanting Adventure", author="Jane Doe", genre="Fantasy", pages=300)
# Serialize the Novel object
novel_schema = NovelSchema()
serialized_data = novel_schema.dump(novel)
print(serialized_data)
# Deserialize
deserialized_data = novel_schema.load(serialized_data)
print(deserialized_data)
It helps to distinguish between a schema's dump and dumps methods: dump() returns a dict, while dumps() returns a JSON string. Likewise, load takes a dictionary, while loads takes a JSON string.
Serializing and deserializing multiple objects is just as simple:
from marshmallow import Schema, fields, post_load
from loguru import logger
class Novel:
    def __init__(self, title, author, genre, pages):
        self.title = title
        self.author = author
        self.genre = genre
        self.pages = pages
    def __repr__(self):
        return f"Novel(title={self.title!r}, author={self.author!r}, genre={self.genre!r}, pages={self.pages!r})"
class NovelSchema(Schema):
    title = fields.Str()
    author = fields.Str()
    genre = fields.Str()
    pages = fields.Int()
    @post_load
    def make(self, data, **kwargs):
        return Novel(**data)
# Create several Novel objects
novel1 = Novel(title="海哥python1", author="暴走的海鸽", genre="Fantasy", pages=300)
novel2 = Novel(title="海哥python2", author="暴走的海鸽", genre="Fantasy", pages=300)
novel3 = Novel(title="海哥python3", author="暴走的海鸽", genre="Fantasy", pages=300)
novels = [novel1, novel2, novel3]
# Serialize the list of Novel objects
novel_schema = NovelSchema(many=True)
serialized_data = novel_schema.dump(novels)
logger.info(serialized_data)
# Deserialize
deserialized_data = novel_schema.load(serialized_data)
logger.info(deserialized_data)
In addition, the Schema class takes two parameters that control the serialized output: only and exclude. With only, the output contains just the listed fields. exclude does the opposite and omits the listed fields.
from marshmallow import Schema, fields, post_load, validates, ValidationError, validate
from loguru import logger
class Novel:
    def __init__(self, title, author, genre="Fantasy2", pages=300):
        self.title = title
        self.author = author
        self.genre = genre
        self.pages = pages
    def __repr__(self):
        return f"Novel(title={self.title!r}, author={self.author!r}, genre={self.genre!r}, pages={self.pages!r})"
class NovelSchema(Schema):
    title = fields.Str(validate=validate.Length(min=1, max=10))
    author = fields.Str()
    genre = fields.Str()
    pages = fields.Int()
    @post_load
    def make(self, data, **kwargs):
        logger.info(data)
        return Novel(**data)
    @validates('pages')
    def validate_pages(self, value):
        if value <= 0:
            raise ValidationError('Pages must be a positive integer.')
# Create a Novel object
novel = Novel(title="The Enchanting Adventure", author="Jane Doe", genre="Fantasy", pages=-300)
# Serialize the Novel object, keeping only title and author
novel_schema = NovelSchema(only=("title", "author",))
serialized_data = novel_schema.dump(novel)
logger.info(serialized_data)
B. Data validation
Data validation is another key feature of marshmallow. Validation can be customized, including type checks, length checks, and custom validators, to ensure the integrity and correctness of the data.
Commonly used built-in validators (in marshmallow.validate) include Length, Range, OneOf, Regexp, Email, and URL. The example below combines Length with a custom @validates method:
from marshmallow import Schema, fields, post_load, validates, ValidationError, validate
from loguru import logger
class Novel:
    def __init__(self, title, author, genre, pages):
        self.title = title
        self.author = author
        self.genre = genre
        self.pages = pages
    def __repr__(self):
        return f"Novel(title={self.title!r}, author={self.author!r}, genre={self.genre!r}, pages={self.pages!r})"
class NovelSchema(Schema):
    title = fields.Str(validate=validate.Length(min=1, max=10))
    author = fields.Str()
    genre = fields.Str()
    pages = fields.Int()
    @post_load
    def make(self, data, **kwargs):
        return Novel(**data)
    @validates('pages')
    def validate_pages(self, value):
        if value <= 0:
            raise ValidationError('Pages must be a positive integer.')
# Create a Novel object
novel = Novel(title="The Enchanting Adventure", author="Jane Doe", genre="Fantasy", pages=-300)
# Serialize the Novel object
novel_schema = NovelSchema()
serialized_data = novel_schema.dump(novel)
logger.info(serialized_data)
# Deserialization
try:
    deserialized_data = novel_schema.load(serialized_data)
    logger.info(deserialized_data)
except ValidationError as e:
    logger.error(e.messages)
    logger.error(e.valid_data)
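In this example, title uses the validate argument for field-level validation, and a validate_pages method checks the pages field: if pages is less than or equal to 0, it raises a ValidationError. Note that dump does not run validators, so the invalid object serializes without complaint. On deserialization, marshmallow collects the validation failures and raises a ValidationError. The error details are in its messages attribute, and the fields that passed are in valid_data. Here both title (24 characters, over max=10) and pages (-300) fail.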
To check for missing attributes, set the required parameter on the field in the schema. This marks the field as mandatory.
from marshmallow import Schema, fields, post_load, validates, ValidationError, validate
from loguru import logger
class Novel:
    def __init__(self, title, author, genre, pages):
        self.title = title
        self.author = author
        self.genre = genre
        self.pages = pages
    def __repr__(self):
        return f"Novel(title={self.title!r}, author={self.author!r}, genre={self.genre!r}, pages={self.pages!r})"
class NovelSchema(Schema):
    title = fields.Str(validate=validate.Length(min=1, max=10))
    author = fields.Str(required=True)
    genre = fields.Str()
    pages = fields.Int()
    @post_load
    def make(self, data, **kwargs):
        return Novel(**data)
    @validates('pages')
    def validate_pages(self, value):
        if value <= 0:
            raise ValidationError('Pages must be a positive integer.')
# Create a Novel object
novel = Novel(title="The Enchanting Adventure", author="Jane Doe", genre="Fantasy", pages=-300)
# Serialize the Novel object
novel_schema = NovelSchema()
serialized_data = novel_schema.dump(novel)
logger.info(serialized_data)
# Deserialization
serialized_data.pop("author")  # remove author
try:
    deserialized_data = novel_schema.load(serialized_data)
    logger.info(deserialized_data)
except ValidationError as e:
    logger.error(e.messages)
    logger.error(e.valid_data)
We marked author as required but did not supply it when deserializing, so load raises a ValidationError whose messages include 'Missing data for required field.' for author.
marshmallow also supports default values for serialization and deserialization and keeps the two clearly separate: load_default fills in a missing value during deserialization, while dump_default fills in a missing value during serialization.
from marshmallow import Schema, fields, post_load, validates, ValidationError, validate
from loguru import logger
class Novel:
    def __init__(self, title, author, genre, pages):
        self.title = title
        self.author = author
        self.genre = genre
        self.pages = pages
    def __repr__(self):
        return f"Novel(title={self.title!r}, author={self.author!r}, genre={self.genre!r}, pages={self.pages!r})"
class NovelSchema(Schema):
    title = fields.Str(validate=validate.Length(min=1, max=1000))
    author = fields.Str(required=True)
    genre = fields.Str()
    pages = fields.Int(load_default=300, dump_default=500)  # default of 300 on load, 500 on dump
    @post_load
    def make(self, data, **kwargs):
        return Novel(**data)
    @validates('pages')
    def validate_pages(self, value):
        if value <= 0:
            raise ValidationError('Pages must be a positive integer.')
# Input for serialization (a plain dict without pages)
novel = {"title": "公众号:海哥python", "author": "暴走的海鸽", "genre": "Fantasy"}
# Serialize
novel_schema = NovelSchema()
serialized_data = novel_schema.dump(novel)
logger.info(f"Serialized: {serialized_data}")
# Deserialization
novel2 = {"title": "公众号:海哥python", "author": "暴走的海鸽", "genre": "Fantasy"}
try:
    deserialized_data = novel_schema.load(novel2)
    logger.info(f"Deserialized: {deserialized_data}")
except ValidationError as e:
    logger.error(e.messages)
    logger.error(f"Valid data: {e.valid_data}")
By default, a Schema serializes and deserializes using keys that are identical to its field names, but this can be customized. If you consume or produce data whose keys do not match your schema, use the data_key parameter to specify the external key, much like an alias.
from marshmallow import Schema, fields, post_load, validates, ValidationError, validate
from loguru import logger
class Novel:
    def __init__(self, title, author, genre, pages):
        self.title = title
        self.author = author
        self.genre = genre
        self.pages = pages
    def __repr__(self):
        return f"Novel(title={self.title!r}, author={self.author!r}, genre={self.genre!r}, pages={self.pages!r})"
class NovelSchema(Schema):
    title = fields.Str(validate=validate.Length(min=1, max=1000))
    author = fields.Str(data_key="author_name")
    genre = fields.Str()
    pages = fields.Int(load_default=300, dump_default=500)  # default of 300 on load, 500 on dump
    @post_load
    def make(self, data, **kwargs):
        return Novel(**data)
    @validates('pages')
    def validate_pages(self, value):
        if value <= 0:
            raise ValidationError('Pages must be a positive integer.')
# Input for serialization uses the internal attribute name "author"
novel = {"title": "公众号:海哥python", "author": "暴走的海鸽2", "genre": "Fantasy"}
# Serialize: the output key becomes "author_name"
novel_schema = NovelSchema()
serialized_data = novel_schema.dump(novel)
logger.info(f"Serialized: {serialized_data}")
# Deserialization: the input must use the external key "author_name"
novel2 = {"title": "公众号:海哥python", "author_name": "暴走的海鸽", "genre": "Fantasy"}
try:
    deserialized_data = novel_schema.load(novel2)
    logger.info(f"Deserialized: {deserialized_data}")
except ValidationError as e:
    logger.error(e.messages)
    logger.error(f"Valid data: {e.valid_data}")
C. Custom field types
With marshmallow it is easy to define custom fields that cover special serialization and deserialization needs, which makes data handling more flexible.
from datetime import datetime
from marshmallow import Schema, fields, post_load, validates, ValidationError
from loguru import logger
class CustomField(fields.Int):
    def _deserialize(self, value, attr, data, **kwargs):
        # Run the normal integer conversion, then add 1
        return super()._deserialize(value, attr, data, **kwargs) + 1
class CustomDateField(fields.Field):
    def _serialize(self, value, attr, obj, **kwargs):
        # On dump: datetime -> 'YYYY-MM-DD' string
        return value.strftime('%Y-%m-%d')
    def _deserialize(self, value, attr, data, **kwargs):
        # On load: 'YYYY-MM-DD' string -> datetime
        return datetime.strptime(value, '%Y-%m-%d')
class Novel:
    def __init__(self, title, author, genre, pages, date):
        self.title = title
        self.author = author
        self.genre = genre
        self.pages = pages
        self.date = date
    def __repr__(self):
        return f"Novel(title={self.title!r}, author={self.author!r}, genre={self.genre!r}, pages={self.pages!r}, date={self.date!r})"
class NovelSchema(Schema):
    title = fields.Str()
    author = fields.Str()
    genre = fields.Str()
    pages = CustomField()
    date = CustomDateField()
    @post_load
    def make(self, data, **kwargs):
        return Novel(**data)
    @validates('pages')
    def validate_pages(self, value):
        if value <= 0:
            raise ValidationError('Pages must be a positive integer.')
# Create a Novel object
novel = Novel(title="The Enchanting Adventure", author="Jane Doe", genre="Fantasy", pages=300,
              date=datetime(2024, 3, 13))
# Serialize the Novel object
novel_schema = NovelSchema()
serialized_data = novel_schema.dump(novel)
logger.info(serialized_data)
# Deserialization
try:
    deserialized_data = novel_schema.load(serialized_data)
    logger.info(deserialized_data)
except ValidationError as e:
    logger.error(e.messages)
    logger.error(e.valid_data)
Advanced techniques and scenarios
Partial loading
When the same Schema is used in several places, you may want to skip required validation for specific fields by passing their names to partial.
from marshmallow import Schema, fields
class UserSchema(Schema):
    name = fields.String(required=True)
    age = fields.Integer(required=True)
result = UserSchema().load({"age": 42}, partial=("name",))
# OR UserSchema(partial=("name",)).load({'age': 42})
print(result)  # => {'age': 42}
Setting partial=True ignores missing fields entirely.
from marshmallow import Schema, fields
class UserSchema(Schema):
    name = fields.String(required=True)
    age = fields.Integer(required=True)
result = UserSchema().load({"age": 42}, partial=True)
# OR UserSchema(partial=True).load({'age': 42})
print(result)  # => {'age': 42}
Handling unknown fields
By default, load raises marshmallow.exceptions.ValidationError when it encounters a key that has no matching Field in the Schema.
from marshmallow import Schema, fields, INCLUDE, ValidationError
class UserSchema(Schema):
    name = fields.String(required=True)
    age = fields.Integer(required=True)
    # class Meta:
    #     unknown = INCLUDE
try:
    result = UserSchema().load({"age": 42, "name": "公众号: 海哥python", "email": "16666@qq.com"})
    print(result)
except ValidationError as err:
    print(err.messages)  # => {'email': ['Unknown field.']}
This behavior can be changed in three ways:
- Set unknown in the schema's class Meta
- At instantiation: schema = UserSchema(unknown=INCLUDE)
- When calling load: UserSchema().load(data, unknown=INCLUDE)
The unknown option accepts one of the following values:
- RAISE (default): raise a ValidationError if there are any unknown fields
- EXCLUDE: drop unknown fields
- INCLUDE: accept and include unknown fields
dump_only ("read-only") and load_only ("write-only") fields
from datetime import datetime
from marshmallow import Schema, fields, INCLUDE
class UserSchema(Schema):
    name = fields.Str()
    # password is "write-only"
    password = fields.Str(load_only=True)
    # created_at is "read-only"
    created_at = fields.DateTime(dump_only=True)
# Serialization: password is left out of the output
user_data = {"name": "Alice", "password": "s3cr3t", "created_at": datetime.now()}
user_schema = UserSchema()
serialized_data = user_schema.dump(user_data)
print("Serialized:", serialized_data)
# Deserialization: password is accepted, created_at is not expected
user_input = {"name": "Bob", "password": "pass123"}
user_schema = UserSchema()
try:
    deserialized_data = user_schema.load(user_input)
    print("Deserialized:", deserialized_data)
except Exception as e:
    print("Deserialization error:", e)
Ordering
For some use cases it is useful to preserve field order in the serialized output. In the marshmallow 3.x release used in this article, you enable ordering by setting the ordered option to True in class Meta. marshmallow then serializes to a collections.OrderedDict.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# __author__:lianhaifeng
# __time__:2024/3/14 21:40
import datetime
from marshmallow import Schema, fields, INCLUDE
from collections import OrderedDict
class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email
        self.created_time = datetime.datetime.now()
class UserSchema(Schema):
    uppername = fields.Function(lambda obj: obj.name.upper())
    class Meta:
        fields = ("name", "email", "created_time", "uppername")
        ordered = True
u = User("Charlie", "charlie@stones.com")
schema = UserSchema()
res = schema.dump(u)
print(isinstance(res, OrderedDict))
# True
print(res)
# OrderedDict([('name', 'Charlie'), ('email', 'charlie@stones.com'), ('created_time', '2019-08-05T20:22:05.788540+00:00'), ('uppername', 'CHARLIE')])
Nested schemas
marshmallow handles nested attributes without any trouble, which is one of the places where I find it really powerful.
A Blog may have an author, represented by a User object.
import datetime as dt
from pprint import pprint
from marshmallow import Schema, fields
class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email
        self.created_at = dt.datetime.now()
        self.friends = []
        self.employer = None
class Blog:
    def __init__(self, title, author):
        self.title = title
        self.author = author  # A User object
class UserSchema(Schema):
    name = fields.String()
    email = fields.Email()
    created_at = fields.DateTime()
class BlogSchema(Schema):
    title = fields.String()
    author = fields.Nested(UserSchema)
user = User(name="Monty", email="monty@python.org")
blog = Blog(title="Something Completely Different", author=user)
result = BlogSchema().dump(blog)
pprint(result)
# {'title': 'Something Completely Different',
#  'author': {'name': 'Monty',
#             'email': 'monty@python.org',
#             'created_at': '2014-08-17T14:58:57.600623+00:00'}}
More on nesting: marshmallow.readthedocs.io/en/stable/n…
Extending Schemas
Pre-processing and post-processing methods
Data pre-processing and post-processing methods can be registered with the pre_load, post_load, pre_dump, and post_dump decorators.
from marshmallow import Schema, fields, post_load, pre_load, pre_dump, post_dump
from loguru import logger
class User:
    def __init__(self, username, email):
        self.username = username
        self.email = email
    def __repr__(self):
        return f"User(username={self.username!r}, email={self.email!r})"
class UserSchema(Schema):
    username = fields.Str()
    email = fields.Email()
    @pre_load
    def preprocess_data(self, data, **kwargs):
        # Pre-process the data before deserialization
        if 'username' in data:
            data['username'] = data['username'].lower()  # lowercase the username
        logger.info("do pre_load...")
        return data
    @post_load
    def make_user(self, data, **kwargs):
        # Build a User object after deserialization
        logger.info("do post_load...")
        return User(**data)
    @pre_dump
    def prepare_data(self, data, **kwargs):
        # Pre-process the data before serialization
        logger.info(type(data))
        if isinstance(data, User):
            data.username = data.username.upper()
        elif 'username' in data:
            data['username'] = data['username'].upper()  # uppercase the username
        logger.info("do pre_dump...")
        return data
    @post_dump
    def clean_data(self, data, **kwargs):
        # Clean up the serialized result after serialization
        logger.info(type(data))
        if 'email' in data:
            del data['email']  # drop the email field
        logger.info("do post_dump...")
        return data
# Data to deserialize
input_data = [{
    "username": "公众号:海哥Python",
    "email": "haige@qq.com"
}]
# Create the Schema and deserialize
user_schema = UserSchema()
result = user_schema.load(input_data, many=True)
logger.info(f"Post Load Result: {result}")  # result of deserialization
# Create a User object
user = User(username="公众号:海哥Python", email="haige@qq.com")
# Serialize the User object
serialized_data = user_schema.dump(user)
logger.info(f"Post Dump Result: {serialized_data}")  # result of serialization
Custom error handling
Override handle_error on a Schema to control what happens when validation fails, for example to log the errors and raise an application-specific exception.
import logging
from marshmallow import Schema, fields
class AppError(Exception):
    pass
class UserSchema(Schema):
    email = fields.Email()
    def handle_error(self, exc, data, **kwargs):
        """Log and raise our custom exception when (de)serialization fails."""
        logging.error(exc.messages)
        raise AppError("An error occurred with input: {0}".format(data))
schema = UserSchema()
schema.load({"email": "invalid-email"})  # raises AppError
Example scenario
Validating request parameters with marshmallow is a fairly elegant approach for a REST API.
Install:
# Flask 3.0.2
# Flask-SQLAlchemy 3.1.1
pip install flask flask-sqlalchemy
Application code
# demo.py
import datetime
from flask import Flask, request
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy.exc import NoResultFound
from marshmallow import Schema, ValidationError, fields, pre_load
app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///quotes.db"
db = SQLAlchemy(app)
##### MODELS #####
class Author(db.Model):  # type: ignore
    id = db.Column(db.Integer, primary_key=True)
    first = db.Column(db.String(80))
    last = db.Column(db.String(80))
class Quote(db.Model):  # type: ignore
    id = db.Column(db.Integer, primary_key=True)
    content = db.Column(db.String, nullable=False)
    author_id = db.Column(db.Integer, db.ForeignKey("author.id"))
    author = db.relationship("Author", backref=db.backref("quotes", lazy="dynamic"))
    posted_at = db.Column(db.DateTime)
##### SCHEMAS #####
class AuthorSchema(Schema):
    id = fields.Int(dump_only=True)
    first = fields.Str()
    last = fields.Str()
    formatted_name = fields.Method("format_name", dump_only=True)
    def format_name(self, author):
        return f"{author.last}, {author.first}"
# Custom validator
def must_not_be_blank(data):
    if not data:
        raise ValidationError("Data not provided.")
class QuoteSchema(Schema):
    id = fields.Int(dump_only=True)
    author = fields.Nested(AuthorSchema, validate=must_not_be_blank)
    content = fields.Str(required=True, validate=must_not_be_blank)
    posted_at = fields.DateTime(dump_only=True)
    # Allow client to pass author's full name in request body
    # e.g. {"author": "Tim Peters"} rather than {"first": "Tim", "last": "Peters"}
    @pre_load
    def process_author(self, data, **kwargs):
        author_name = data.get("author")
        if author_name:
            first, last = author_name.split(" ")
            author_dict = dict(first=first, last=last)
        else:
            author_dict = {}
        data["author"] = author_dict
        return data
author_schema = AuthorSchema()
authors_schema = AuthorSchema(many=True)
quote_schema = QuoteSchema()
quotes_schema = QuoteSchema(many=True, only=("id", "content"))
##### API #####
@app.route("/authors")
def get_authors():
    authors = Author.query.all()
    # Serialize the queryset
    result = authors_schema.dump(authors)
    return {"authors": result}
@app.route("/authors/<int:pk>")
def get_author(pk):
    try:
        author = Author.query.filter(Author.id == pk).one()
    except NoResultFound:
        return {"message": "Author could not be found."}, 400
    author_result = author_schema.dump(author)
    quotes_result = quotes_schema.dump(author.quotes.all())
    return {"author": author_result, "quotes": quotes_result}
@app.route("/quotes/", methods=["GET"])
def get_quotes():
    quotes = Quote.query.all()
    result = quotes_schema.dump(quotes, many=True)
    return {"quotes": result}
@app.route("/quotes/<int:pk>")
def get_quote(pk):
    try:
        quote = Quote.query.filter(Quote.id == pk).one()
    except NoResultFound:
        return {"message": "Quote could not be found."}, 400
    result = quote_schema.dump(quote)
    return {"quote": result}
@app.route("/quotes/", methods=["POST"])
def new_quote():
    json_data = request.get_json()
    if not json_data:
        return {"message": "No input data provided"}, 400
    # Validate and deserialize input
    try:
        data = quote_schema.load(json_data)
    except ValidationError as err:
        return err.messages, 422
    first, last = data["author"]["first"], data["author"]["last"]
    author = Author.query.filter_by(first=first, last=last).first()
    if author is None:
        # Create a new author
        author = Author(first=first, last=last)
        db.session.add(author)
    # Create new quote
    quote = Quote(
        content=data["content"], author=author, posted_at=datetime.datetime.utcnow()
    )
    db.session.add(quote)
    db.session.commit()
    result = quote_schema.dump(Quote.query.get(quote.id))
    return {"message": "Created new quote.", "quote": result}
if __name__ == "__main__":
    with app.app_context():
        db.create_all()
    app.run(debug=True, port=5000)
Start the server:
python .\demo.py
Install httpie for testing:
pip install httpie
Add a valid quote:
Add an invalid quote:
Query quotes:
Libraries commonly paired with marshmallow include flask-marshmallow and flask-smorest.
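Summary
Overall, marshmallow offers powerful serialization, deserialization, and data validation. It fits a wide range of complex data-processing scenarios and makes that work more convenient and efficient.
- Compared with the json module in the Python standard library, marshmallow provides more flexible and more capable serialization and validation, suited to more complex data structures.
- Compared with the serializers in the Django framework, marshmallow is lighter and not tied to a specific framework, so it can be used in any kind of Python project.
It suits web development, RESTful API construction, data persistence, data validation, and similar scenarios, especially when complex data structures are involved.
As data-processing requirements grow more complex, a dedicated serialization and validation library such as marshmallow becomes increasingly useful.
Whether you are building a web API, validating forms, working with a database, or importing and exporting data, marshmallow can handle the data efficiently. I hope this article helps you understand and use marshmallow better!
marshmallow can do far more than what is covered here. See the official documentation for more techniques.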
Reposted from: https://juejin.cn/post/7367195057559060492