
Appium Advanced Operations -- A Source-Code Perspective -- Simulating Complex Gestures

news 2025/10/16 19:08:31

This article continues from the previous one, "Android Automation -- Appium Basic Operations" (https://blog.csdn.net/fantasy_4/article/details/146080872), which covered how to perform basic operations on elements located with Appium. Here we move on to the advanced operations and, reading the source code, explain how Appium simulates complex gestures.

W3C Actions

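Mobile clients support a richer set of interactions than the Web: besides tapping and typing, there are long presses, swipes, multi-finger gestures, and so on.

The Appium Python client is built on top of Selenium. Selenium 4.0 dropped the JSON Wire Protocol (a specification for the communication between the client and browser drivers) in favor of W3C WebDriver, and Appium 2.x supports W3C WebDriver as well. Appium 2.x also no longer relies on the TouchAction and MultiAction classes; W3C Actions are used instead.

Simulating a complex gesture with Appium takes three steps: 1. choose the input source, 2. create the actions, 3. chain the actions and perform them all. The rest of this article follows these three steps.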

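W3C input source types

	`key` input source	`pointer` input source	`wheel` input source (not applicable to mobile clients)	`none` input source
Device type	Keyboard	Mouse, touch (finger), pen	Mouse wheel, trackpad scrolling	No device
Purpose	Simulates keyboard key presses	Simulates pointer operations of a mouse, a finger, or a pen	Simulates scrolling with a wheel or trackpad	Used for steps that need no device
Supported operations	Key down (`keyDown`), key up (`keyUp`); higher-level send-keys operations are built from these	Move the pointer (`pointerMove`), press (`pointerDown`), release (`pointerUp`); click, drag and drop, hover, and double click are composed from these	Vertical and horizontal scrolling (`scroll`); zooming via a key combination such as `Ctrl + scroll`	Pause (`pause`); usually combined with other input sources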
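
To make the table more concrete, here is a minimal sketch of what a `pointer` input source of type touch looks like as a W3C Actions payload. It builds the JSON by hand with plain dictionaries and does not call Appium or Selenium; the helper name `touch_source` and the coordinates are assumptions made up for illustration.

```python
import json


def touch_source(name, actions):
    """Wrap a list of low-level actions in a W3C pointer input source of type touch.

    `touch_source` is a hypothetical helper for this sketch, not a library function.
    """
    return {
        "type": "pointer",
        "id": name,
        "parameters": {"pointerType": "touch"},
        "actions": actions,
    }


# A long press expressed with the low-level pointer actions from the table.
# The coordinates are made-up values for illustration.
long_press = [
    {"type": "pointerMove", "duration": 0, "x": 540, "y": 576},
    {"type": "pointerDown", "button": 0},
    {"type": "pause", "duration": 2000},  # durations in the payload are milliseconds
    {"type": "pointerUp", "button": 0},
]

payload = {"actions": [touch_source("finger0", long_press)]}
print(json.dumps(payload, indent=2))
```

Every gesture discussed later in this article boils down to one or more such action lists, one per finger.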

Creating actions -- a walkthrough of part of the ActionBuilder source

    def __init__(
        self,
        driver,
        mouse: Optional[PointerInput] = None,
        wheel: Optional[WheelInput] = None,
        keyboard: Optional[KeyInput] = None,
        duration: int = 250,
    ) -> None:
        mouse = mouse or PointerInput(interaction.POINTER_MOUSE, "mouse")
        keyboard = keyboard or KeyInput(interaction.KEY)
        wheel = wheel or WheelInput(interaction.WHEEL)
        self.devices = [mouse, keyboard, wheel]
        self._key_action = KeyActions(keyboard)
        self._pointer_action = PointerActions(mouse, duration=duration)
        self._wheel_action = WheelActions(wheel)
        self.driver = driver
     
    def add_pointer_input(self, kind: str, name: str) -> PointerInput:
        """Add a new pointer input device to the action builder.

        Parameters:
        ----------
        kind : str
            The kind of pointer input device.
                - "mouse"
                - "touch"
                - "pen"

        name : str
            The name of the pointer input device.

        Returns:
        --------
        PointerInput : The newly created pointer input device.

        Example:
        --------
        >>> action_builder = ActionBuilder(driver)
        >>> action_builder.add_pointer_input(kind="mouse", name="mouse")
        """
        new_input = PointerInput(kind, name)
        self._add_input(new_input)
        return new_input
        
    def _add_input(self, new_input: Union[KeyInput, PointerInput, WheelInput]) -> None:
        """Add a new input device to the action builder.

        Parameters:
        ----------
        new_input : Union[KeyInput, PointerInput, WheelInput]
            The new input device to add.
        """
        self.devices.append(new_input)

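The constructor shows that creating an ActionBuilder instance requires one argument, the WebDriver instance. The pointer input defaults to kind mouse, and the devices list defaults to the three input sources [mouse, keyboard, wheel].

Example: to add a new pointer input source, call add_pointer_input. It appends new_input to the devices list and returns the newly created PointerInput instance.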

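The main methods are:

add_key_input(name: str): adds a keyboard input source
add_pointer_input(kind: str, name: str): adds a pointer input source; as the docstring above shows, kind must be one of mouse, touch, pen
add_wheel_input(name: str): adds a wheel input source (not used on mobile clients)
perform(): performs the stored actions
clear_actions(): clears the actions that have already been stored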
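
The relationship between the devices list, add_pointer_input, and perform() can be sketched with a toy model. The classes below (`ToyInput`, `ToyBuilder`) are simplified stand-ins written for this article, not Selenium's implementation: they only mimic the behavior visible in the source excerpt above, and `perform()` returns the payload instead of sending it to a driver.

```python
POINTER_KINDS = ("mouse", "touch", "pen")


class ToyInput:
    """Hypothetical stand-in for an input source: it just stores queued actions."""

    def __init__(self, type_, name, kind=None):
        self.type = type_
        self.name = name
        self.kind = kind
        self.actions = []

    def add_action(self, action):
        self.actions.append(action)

    def encode(self):
        encoded = {"type": self.type, "id": self.name, "actions": list(self.actions)}
        if self.kind is not None:
            encoded["parameters"] = {"pointerType": self.kind}
        return encoded


class ToyBuilder:
    """Hypothetical stand-in for ActionBuilder, reduced to the devices list."""

    def __init__(self):
        # Mirrors the default [mouse, keyboard, wheel] list from the excerpt
        self.devices = [
            ToyInput("pointer", "mouse", "mouse"),
            ToyInput("key", "key"),
            ToyInput("wheel", "wheel"),
        ]

    def add_pointer_input(self, kind, name):
        if kind not in POINTER_KINDS:
            raise ValueError(f"invalid pointer kind: {kind!r}")
        new_input = ToyInput("pointer", name, kind)
        self.devices.append(new_input)
        return new_input

    def perform(self):
        # The toy only returns what would be sent; nothing talks to a device
        return {"actions": [device.encode() for device in self.devices]}


builder = ToyBuilder()
builder.devices = []  # drop the defaults, keep only the inputs we add
finger = builder.add_pointer_input("touch", "finger0")
finger.add_action({"type": "pointerDown", "button": 0})
finger.add_action({"type": "pointerUp", "button": 0})
payload = builder.perform()
print(payload)
```

The point of the sketch is the order of events: actions accumulate on the input source, and only perform() turns them into a single payload.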

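Chaining and performing actions -- a walkthrough of part of the ActionChains source

Simulating user interaction with W3C Actions goes through the ActionChains class, which chains the actions and performs them all.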

"""ActionChains are a way to automate low level interactions such as mouse
movements, mouse button actions, key press, and context menu interactions.
This is useful for doing more complex actions like hover over and drag and
drop.

Generate user actions.
   When you call methods for actions on the ActionChains object,
   the actions are stored in a queue in the ActionChains object.
   When you call perform(), the events are fired in the order they
   are queued up.
"""

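In short:

1. ActionChains automates low-level interactions such as mouse movements, mouse button actions, key presses, and context menu interactions. This is useful for more complex actions such as hovering and drag and drop.

2. When you call action methods on an ActionChains object, the actions are stored in a queue inside that object. When you call perform(), the events are fired in the order in which they were queued.

In other words, after calling a series of action methods on ActionChains you still have to call perform() at the end, which executes the queued actions one by one.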

AnyDevice = Union[PointerInput, KeyInput, WheelInput]

def __init__(self, driver: WebDriver, duration: int = 250, devices: list[AnyDevice] | None = None) -> None:
        """Creates a new ActionChains.

        :Args:
         - driver: The WebDriver instance which performs user actions.
         - duration: override the default 250 msecs of DEFAULT_MOVE_DURATION in PointerInput
        """
        self._driver = driver
        mouse = None
        keyboard = None
        wheel = None
        if devices is not None and isinstance(devices, list):
            for device in devices:
                if isinstance(device, PointerInput):
                    mouse = device
                if isinstance(device, KeyInput):
                    keyboard = device
                if isinstance(device, WheelInput):
                    wheel = device
        self.w3c_actions = ActionBuilder(driver, mouse=mouse, keyboard=keyboard, wheel=wheel, duration=duration)

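From these excerpts of ActionBuilder and ActionChains we can see that:
Creating an ActionChains instance requires a WebDriver instance. The PointerInput duration defaults to 250 milliseconds (optional), and devices is an optional list whose items may be any of PointerInput, KeyInput, and WheelInput; it defaults to None.
When devices is None, mouse, keyboard, and wheel are passed to ActionBuilder as None, so ActionBuilder falls back to its own defaults and its devices list becomes [mouse, keyboard, wheel].
The self.w3c_actions attribute of ActionChains is an ActionBuilder instance.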

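Summary of the flow

The overall flow for simulating a complex gesture with Appium is:

Create an ActionChains instance (called actions here); the WebDriver instance must be passed in.
Through its w3c_actions attribute (an ActionBuilder instance), set devices to an empty list instead of using the default [mouse, keyboard, wheel], because on a mobile client the pointer kind is touch, not mouse.
Call one of the add_XXX_input methods of w3c_actions to append a new input source to the devices list; the method returns the new XXXInput instance.
Call the action methods of the new input source (new_input), for example pointer down, pointer up, and pointer move. Their source is not covered here; it is easy to read on your own.
After queuing a series of input actions this way, call perform() on the ActionChains instance. This in turn calls ActionBuilder's perform(), which executes the whole series.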

Examples

Long press

from time import sleep

from appium import webdriver
from appium.options.android import UiAutomator2Options
from appium.webdriver.common.appiumby import AppiumBy
from selenium.webdriver import ActionChains
from selenium.webdriver.common.actions.mouse_button import MouseButton

desired_caps = {
  "platformName": "Android",
  "appium:deviceName": "Q7PDU19731008305",
  "appium:appPackage": "com.dangdang.buy2",
  "appium:appActivity": "com.dangdang.buy2.activity.ActivityMainTab",
  "appium:automationName": "UiAutomator2",
  "newCommandTimeout": 300
}

print("Desired Capabilities: ", desired_caps)

# Create the session before the try block so that `driver` is always defined in finally
driver = webdriver.Remote("http://localhost:4723", options=UiAutomator2Options().load_capabilities(desired_caps))

try:
    # Locate and tap the shopping cart tab
    driver.find_element(AppiumBy.ID, "com.dangdang.buy2:id/tab_cart_iv").click()
    sleep(3)
    # Get the screen size
    size_dict = driver.get_window_size()
    # Create an ActionChains instance
    actions = ActionChains(driver)
    # Empty the default input device list
    actions.w3c_actions.devices = []

    # ====== Long press with one finger ======
    new_input = actions.w3c_actions.add_pointer_input('touch', 'finger0')
    # Move to a point given relative to the screen: 0.5 of the width, 0.3 of the height
    new_input.create_pointer_move(x=size_dict['width'] * 0.5, y=size_dict['height'] * 0.3)
    # Touch down (MouseButton.LEFT stands for the primary contact)
    new_input.create_pointer_down(button=MouseButton.LEFT)
    # Hold for 2 seconds to simulate a long press; the argument is in seconds
    new_input.create_pause(2)
    # Lift the finger
    new_input.create_pointer_up(button=MouseButton.LEFT)
    # Perform the queued actions
    actions.perform()
    sleep(3)

finally:
    # Close the Appium session
    driver.quit()

Swipe left

from time import sleep

from appium import webdriver
from appium.options.android import UiAutomator2Options
from appium.webdriver.common.appiumby import AppiumBy
from selenium.webdriver import ActionChains
from selenium.webdriver.common.actions.mouse_button import MouseButton

desired_caps = {
    "platformName": "Android",
    "deviceName": "Q7PDU19731008305",
    "appPackage": "com.sankuai.movie",
    "appActivity": "com.sankuai.movie.MovieMainActivity",
    "automationName": "UiAutomator2"
}

print("Desired Capabilities: ", desired_caps)

# Create the session before the try block so that `driver` is always defined in finally
driver = webdriver.Remote("http://localhost:4723", options=UiAutomator2Options().load_capabilities(desired_caps))

try:
    driver.implicitly_wait(10)
    # Tap to accept the terms
    driver.find_element(AppiumBy.ANDROID_UIAUTOMATOR, 'new UiSelector().resourceId("com.sankuai.movie:id/cyf")').click()

    # ======== Swipe left ========
    size_dict = driver.get_window_size()
    actions = ActionChains(driver)
    # Empty the default input device list
    actions.w3c_actions.devices = []

    new_input = actions.w3c_actions.add_pointer_input('touch', 'finger0')

    # Move the finger to the start point: 0.7 of the width, 0.6 of the height
    new_input.create_pointer_move(x=size_dict['width']*0.7, y=size_dict['height']*0.6)
    # Touch down
    new_input.create_pointer_down(button=MouseButton.LEFT)
    # Hold for 0.2 seconds before moving
    new_input.create_pause(0.2)
    # Drag left to 0.2 of the width, keeping y at 0.6 of the height
    new_input.create_pointer_move(x=size_dict['width']*0.2, y=size_dict['height']*0.6)
    # Lift the finger
    new_input.create_pointer_up(button=MouseButton.LEFT)
    # Perform the queued actions. The create_* calls above only queue actions,
    # so calling sleep() between them would not change the gesture itself.
    actions.perform()

    sleep(3)

finally:
    # Close the Appium session
    driver.quit()
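
The swipe coordinates are obtained by multiplying the window size by fractions of the screen, which produces floats. As a small sketch, the helper below turns such fractions into whole-pixel coordinates. `swipe_points` is a hypothetical helper, and the 1080 x 1920 size is an assumed stand-in for the dictionary returned by driver.get_window_size(). Rounding is a precaution that avoids depending on how a particular driver treats fractional coordinates.

```python
def swipe_points(size, start_x, end_x, y):
    """Convert screen fractions into integer pixel coordinates for a horizontal swipe.

    `size` is a dict with 'width' and 'height' keys, like the one the article
    reads from driver.get_window_size().
    """
    width, height = size["width"], size["height"]
    start = (round(width * start_x), round(height * y))
    end = (round(width * end_x), round(height * y))
    return start, end


# Assumed screen size for illustration only
size = {"width": 1080, "height": 1920}
start, end = swipe_points(size, start_x=0.7, end_x=0.2, y=0.6)
print(start, end)  # (756, 1152) (216, 1152)
```

The two tuples could then be passed as the x and y arguments of the two create_pointer_move calls in the swipe example.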

Multi-finger gestures

    # Only the two-finger zoom-in portion of the code is shown

    # Get the screen size
    size_dict = driver.get_window_size()
    # Create an ActionChains instance
    actions = ActionChains(driver)
    # Empty the default input device list
    actions.w3c_actions.devices = []

    # ====== First finger: slide from the center toward the top-right corner ======
    # Add a new input source of kind touch with id finger0 to the device list
    new_input = actions.w3c_actions.add_pointer_input('touch', 'finger0')
    # Move to a point given relative to the screen: 0.5 of the width, 0.5 of the height
    new_input.create_pointer_move(x=size_dict['width'] * 0.5, y=size_dict['height'] * 0.5)
    # Touch down
    new_input.create_pointer_down(button=MouseButton.LEFT)
    # Hold for 0.2 seconds before moving; the argument is in seconds
    new_input.create_pause(0.2)
    # Move to a point given relative to the screen: 0.9 of the width, 0.1 of the height
    new_input.create_pointer_move(x=size_dict['width'] * 0.9, y=size_dict['height'] * 0.1)
    # Lift the finger
    new_input.create_pointer_up(button=MouseButton.LEFT)

    # ====== Second finger: slide from the center toward the bottom-left corner ======
    # Add a new input source of kind touch with id finger1 to the device list
    new_input = actions.w3c_actions.add_pointer_input('touch', 'finger1')
    # Move to a point given relative to the screen: 0.5 of the width, 0.5 of the height
    new_input.create_pointer_move(x=size_dict['width'] * 0.5, y=size_dict['height'] * 0.5)
    # Touch down
    new_input.create_pointer_down(button=MouseButton.LEFT)
    # Hold for 0.2 seconds before moving; the argument is in seconds
    new_input.create_pause(0.2)
    # Move to a point given relative to the screen: 0.1 of the width, 0.9 of the height
    new_input.create_pointer_move(x=size_dict['width'] * 0.1, y=size_dict['height'] * 0.9)
    # Lift the finger
    new_input.create_pointer_up(button=MouseButton.LEFT)

    # Perform the queued actions of both fingers
    actions.perform()
    sleep(3)
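
The two fingers move together because W3C Actions are dispatched tick by tick: the n-th action of every input source belongs to tick n, and a tick lasts as long as its longest action. The sketch below illustrates that grouping with plain Python. The helpers `finger`, `group_by_tick`, and `tick_duration` are hypothetical, the coordinates assume a 1080 x 1920 screen, and the code is a simplified illustration of the idea, not what Appium runs internally.

```python
from itertools import zip_longest


def finger(start, end):
    """Build the action list for one finger sliding from start to end (made-up timings)."""
    return [
        {"type": "pointerMove", "duration": 0, "x": start[0], "y": start[1]},
        {"type": "pointerDown", "button": 0},
        {"type": "pause", "duration": 200},
        {"type": "pointerMove", "duration": 250, "x": end[0], "y": end[1]},
        {"type": "pointerUp", "button": 0},
    ]


def group_by_tick(*sequences):
    """Group the n-th action of every input source into tick n (None if a source has none)."""
    return [list(tick) for tick in zip_longest(*sequences, fillvalue=None)]


def tick_duration(tick):
    """A tick lasts as long as its longest action, in milliseconds."""
    return max((a.get("duration", 0) for a in tick if a is not None), default=0)


# Both fingers start at the center of an assumed 1080 x 1920 screen
finger0 = finger((540, 960), (972, 192))   # toward the top-right corner
finger1 = finger((540, 960), (108, 1728))  # toward the bottom-left corner

ticks = group_by_tick(finger0, finger1)
durations = [tick_duration(tick) for tick in ticks]
print(durations)  # [0, 0, 200, 250, 0]
```

Because both action lists have the same length and the same step order, each tick pairs the matching steps of the two fingers, which is what makes the gesture a pinch-open rather than two separate swipes.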

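The next article will continue with other advanced Appium operations, such as recognizing toast elements, working with hybrid apps, and taking screenshots. Stay tuned.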
