ml.models.lora

Helper utilities for using LoRA layers.

LoRA layers are drop-in replacements for certain PyTorch modules that can be used to fine-tune pre-trained models. The method is described in the paper LoRA: Low-Rank Adaptation of Large Language Models.

import torch.nn as nn

from ml.models.lora import lora

# The pre-trained model weights can be loaded into the LoRA model.
model = nn.Sequential(nn.Linear(5, 7), nn.Linear(7, 5))
lora_model = nn.Sequential(lora(nn.Linear(5, 7), r=4), lora(nn.Linear(7, 5), r=4))
lora_model.load_state_dict(model.state_dict())  # No errors

from ml.models.lora import LoraLinear

# Alternatively, you can substitute the LoRA module class directly.
model = nn.Sequential(LoraLinear(5, 7, r=4), LoraLinear(7, 5, r=4))

The following modules can be wrapped with LoRA:

  • nn.Embedding

  • nn.Linear

  • nn.Conv1d

  • nn.ConvTranspose1d

  • nn.Conv2d

  • nn.ConvTranspose2d

  • nn.LSTM

  • nn.GRU

  • ColumnParallelLinear

  • RowParallelLinear

  • ParallelEmbedding

In the paper, the authors typically use values of 1, 2, 4, or 8 for the r parameter. The lora_alpha parameter is typically set to 1.0, but can be tuned to improve performance.
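
For intuition, a LoRA linear layer computes the frozen base projection plus a low-rank update scaled by lora_alpha / r. The sketch below illustrates the idea only; the factor names lora_a and lora_b are illustrative and do not necessarily match this module's internal attribute names.

import torch.nn.functional as F

def lora_linear_forward(x, weight, bias, lora_a, lora_b, r, lora_alpha):
    # weight: frozen (out_features, in_features) pre-trained matrix.
    # lora_a: (r, in_features) and lora_b: (out_features, r) trainable factors.
    scaling = lora_alpha / r
    return F.linear(x, weight, bias) + scaling * F.linear(F.linear(x, lora_a), lora_b)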

class ml.models.lora.LoraEmbedding(num_embeddings: int, embedding_dim: int, r: int, lora_alpha: float = 1.0, lora_dropout: float = 0.0, merge: bool = False, padding_idx: int | None = None, max_norm: float | None = None, norm_type: float = 2.0, scale_grad_by_freq: bool = False, sparse: bool = False)[source]

Bases: Embedding, _Lora

Initializes internal Module state, shared by both nn.Module and ScriptModule.

reset_parameters() → None[source]
reset_lora_parameters() → None[source]

Resets LoRA parameters in-place.

train(mode: bool = True) → LoraEmbedding[source]

Sets the module in training mode.

This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

Parameters:

mode (bool) – whether to set training mode (True) or evaluation mode (False). Default: True.

Returns:

self

Return type:

Module

forward(x: Tensor) → Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class ml.models.lora.LoraLinear(in_features: int, out_features: int, r: int, lora_alpha: float = 1.0, lora_dropout: float = 0.0, fan_in_fan_out: bool = False, merge: bool = False, bias: bool = True)[source]

Bases: Linear, _Lora

Initializes internal Module state, shared by both nn.Module and ScriptModule.

reset_parameters() → None[source]
reset_lora_parameters() → None[source]

Resets LoRA parameters in-place.

train(mode: bool = True) → LoraLinear[source]

Sets the module in training mode.

This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

Parameters:

mode (bool) – whether to set training mode (True) or evaluation mode (False). Default: True.

Returns:

self

Return type:

Module

forward(x: Tensor) → Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
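
Because LoraLinear keeps the same weight and bias layout as nn.Linear, a pre-trained linear layer's state dict loads directly and the input/output shapes are unchanged. A small illustrative example (r=4 and the layer sizes are arbitrary):

import torch
import torch.nn as nn

from ml.models.lora import LoraLinear

base = nn.Linear(16, 32)
layer = LoraLinear(16, 32, r=4)
layer.load_state_dict(base.state_dict())  # pre-trained weight and bias carry over
out = layer(torch.randn(8, 16))  # same (8, 32) output shape as the base layer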

class ml.models.lora.LoraConv1d(in_channels: int, out_channels: int, kernel_size: int | tuple[int], r: int, lora_alpha: float = 1.0, lora_dropout: float = 0.0, merge: bool = False, stride: int | tuple[int] = 1, padding: str | int | tuple[int] = 0, dilation: int | tuple[int] = 1, groups: int = 1, bias: bool = True)[source]

Bases: Conv1d, _Lora

Initializes internal Module state, shared by both nn.Module and ScriptModule.

reset_parameters() → None[source]
reset_lora_parameters() → None[source]

Resets LoRA parameters in-place.

train(mode: bool = True) → LoraConv1d[source]

Sets the module in training mode.

This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

Parameters:

mode (bool) – whether to set training mode (True) or evaluation mode (False). Default: True.

Returns:

self

Return type:

Module

forward(x: Tensor) → Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class ml.models.lora.LoraConvTranspose1d(in_channels: int, out_channels: int, kernel_size: int | tuple[int], r: int, lora_alpha: float = 1.0, lora_dropout: float = 0.0, merge: bool = False, stride: int | tuple[int] = 1, padding: int | tuple[int] = 0, output_padding: int | tuple[int] = 0, dilation: int | tuple[int] = 1, groups: int = 1, bias: bool = True)[source]

Bases: ConvTranspose1d, _Lora

Initializes internal Module state, shared by both nn.Module and ScriptModule.

reset_parameters() → None[source]
reset_lora_parameters() → None[source]

Resets LoRA parameters in-place.

train(mode: bool = True) → LoraConvTranspose1d[source]

Sets the module in training mode.

This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

Parameters:

mode (bool) – whether to set training mode (True) or evaluation mode (False). Default: True.

Returns:

self

Return type:

Module

forward(x: Tensor, output_size: list[int] | None = None) → Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class ml.models.lora.LoraConv2d(in_channels: int, out_channels: int, kernel_size: int | tuple[int, int], r: int, lora_alpha: float = 1.0, lora_dropout: float = 0.0, merge: bool = False, stride: int | tuple[int, int] = (1, 1), padding: str | int | tuple[int, int] = (0, 0), dilation: int | tuple[int, int] = (1, 1), groups: int = 1, bias: bool = True)[source]

Bases: Conv2d, _Lora

Initializes internal Module state, shared by both nn.Module and ScriptModule.

reset_parameters() → None[source]
reset_lora_parameters() → None[source]

Resets LoRA parameters in-place.

train(mode: bool = True) → LoraConv2d[source]

Sets the module in training mode.

This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

Parameters:

mode (bool) – whether to set training mode (True) or evaluation mode (False). Default: True.

Returns:

self

Return type:

Module

forward(x: Tensor) → Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class ml.models.lora.LoraConvTranspose2d(in_channels: int, out_channels: int, kernel_size: int | tuple[int, int], r: int, lora_alpha: float = 1.0, lora_dropout: float = 0.0, merge: bool = False, stride: int | tuple[int, int] = (1, 1), padding: int | tuple[int, int] = (0, 0), output_padding: int | tuple[int, int] = (0, 0), dilation: int | tuple[int, int] = (1, 1), groups: int = 1, bias: bool = True)[source]

Bases: ConvTranspose2d, _Lora

Initializes internal Module state, shared by both nn.Module and ScriptModule.

reset_parameters() → None[source]
reset_lora_parameters() → None[source]

Resets LoRA parameters in-place.

train(mode: bool = True) → LoraConvTranspose2d[source]

Sets the module in training mode.

This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

Parameters:

mode (bool) – whether to set training mode (True) or evaluation mode (False). Default: True.

Returns:

self

Return type:

Module

forward(x: Tensor, output_size: list[int] | None = None) → Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class ml.models.lora.LoraLSTM(input_size: int, hidden_size: int, r: int, lora_alpha: float = 1.0, num_layers: int = 1, bias: bool = True, batch_first: bool = False, dropout: float = 0.0, bidirectional: bool = False, proj_size: int = 0)[source]

Bases: LSTM, _LoraRNN

Initializes internal Module state, shared by both nn.Module and ScriptModule.

class ml.models.lora.LoraGRU(input_size: int, hidden_size: int, r: int, lora_alpha: float = 1.0, num_layers: int = 1, bias: bool = True, batch_first: bool = False, dropout: float = 0.0, bidirectional: bool = False, proj_size: int = 0)[source]

Bases: GRU, _LoraRNN

Initializes internal Module state, shared by both nn.Module and ScriptModule.
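
Note that LoraLSTM and LoraGRU take no lora_dropout argument, since dropout on the LoRA path is not supported for RNNs (see lora() below). An illustrative sketch of wrapping a pre-trained LSTM (the sizes and r=4 are arbitrary):

import torch.nn as nn

from ml.models.lora import lora

base_lstm = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)
lora_lstm = lora(base_lstm, r=4)  # returns a LoraLSTM
lora_lstm.load_state_dict(base_lstm.state_dict())  # pre-trained weights carry over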

class ml.models.lora.LoraLSTMCell(input_size: int, hidden_size: int, r: int, bias: bool = True, lora_alpha: float = 1.0)[source]

Bases: LSTMCell, _LoraRNNCellBase

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input: Tensor, hx: tuple[torch.Tensor, torch.Tensor] | None = None) → tuple[torch.Tensor, torch.Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class ml.models.lora.LoraGRUCell(input_size: int, hidden_size: int, r: int, bias: bool = True, lora_alpha: float = 1.0)[source]

Bases: GRUCell, _LoraRNNCellBase

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input: Tensor, hx: Tensor | None = None) → Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class ml.models.lora.LoraParallelEmbedding(num_embeddings: int, embedding_dim: int, r: int, lora_alpha: float = 1.0, lora_dropout: float = 0.0, merge: bool = False, padding_idx: int | None = None, max_norm: float | None = None, norm_type: float = 2.0, scale_grad_by_freq: bool = False, sparse: bool = False, init_type: Literal['orthogonal', 'normal', 'biased_normal', 'uniform', 'kaiming_uniform', 'kaiming_normal', 'xavier_uniform', 'xavier_normal', 'trunc_normal', 'dirac', 'constant', 'zeros', 'ones'] = 'xavier_normal')[source]

Bases: ParallelEmbedding, _Lora

Model-parallel embeddings.

Embeddings are partitioned along the embedding_dim dimension.

Parameters:
  • num_embeddings – Number of embeddings (vocabulary size).

  • embedding_dim – Embedding dimension; must be divisible by the model-parallel size.

  • padding_idx – See nn.Embedding.

  • max_norm – See nn.Embedding.

  • norm_type – See nn.Embedding.

  • scale_grad_by_freq – See nn.Embedding.

  • sparse – See nn.Embedding.

  • init_type – Initialization type.

reset_parameters() → None[source]
reset_lora_parameters() → None[source]

Resets LoRA parameters in-place.

train(mode: bool = True) → LoraParallelEmbedding[source]

Sets the module in training mode.

This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

Parameters:

mode (bool) – whether to set training mode (True) or evaluation mode (False). Default: True.

Returns:

self

Return type:

Module

forward(x: Tensor) → Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class ml.models.lora.LoraColumnParallelLinear(in_features: int, out_features: int, r: int, lora_alpha: float = 1.0, lora_dropout: float = 0.0, fan_in_fan_out: bool = False, merge: bool = False, bias: bool = True, gather_output: bool = True, init_type: Literal['orthogonal', 'normal', 'biased_normal', 'uniform', 'kaiming_uniform', 'kaiming_normal', 'xavier_uniform', 'xavier_normal', 'trunc_normal', 'dirac', 'constant', 'zeros', 'ones'] = 'xavier_normal', stride: int = 1)[source]

Bases: ColumnParallelLinear, _Lora

A column parallel linear layer.

This layer splits the weight matrix along the output feature dimension, and each rank is only responsible for out_features // world_size number of output features.

Parameters:
  • in_features – Number of input features.

  • out_features – Number of output features.

  • bias – Whether to include a bias term.

  • gather_output – Whether to gather the output from all the model parallel GPUs.

  • init_type – Initialization type.

  • stride – Stride for the initialization.

  • r – The LoRA rank to use, if any.

reset_parameters() → None[source]
reset_lora_parameters() → None[source]

Resets LoRA parameters in-place.

train(mode: bool = True) → LoraColumnParallelLinear[source]

Sets the module in training mode.

This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

Parameters:

mode (bool) – whether to set training mode (True) or evaluation mode (False). Default: True.

Returns:

self

Return type:

Module

forward(x: Tensor) → Tensor[source]

Forward method.

Parameters:

x – input tensor of size (*, in_features)

Returns:

Output tensor of size (*, out_features // world_size), or (*, out_features) if gather_output is set to True.

class ml.models.lora.LoraRowParallelLinear(in_features: int, out_features: int, r: int, lora_alpha: float = 1.0, lora_dropout: float = 0.0, fan_in_fan_out: bool = False, merge: bool = False, bias: bool = True, input_is_parallel: bool = False, init_type: Literal['orthogonal', 'normal', 'biased_normal', 'uniform', 'kaiming_uniform', 'kaiming_normal', 'xavier_uniform', 'xavier_normal', 'trunc_normal', 'dirac', 'constant', 'zeros', 'ones'] = 'xavier_normal', stride: int = 1)[source]

Bases: RowParallelLinear, _Lora

A row parallel linear layer.

This layer splits the weight matrix along the input feature dimension, and each rank is only responsible for in_features // world_size number of input features.

This can be paired with a column parallel layer to create a model parallel two-stage linear layer.

Parameters:
  • in_features – Number of input features.

  • out_features – Number of output features.

  • bias – Whether to include a bias term.

  • input_is_parallel – Whether the input tensor is already split along the feature dimension.

  • init_type – Initialization type.

  • stride – Stride for the initialization.

reset_parameters() → None[source]
reset_lora_parameters() → None[source]

Resets LoRA parameters in-place.

train(mode: bool = True) → LoraRowParallelLinear[source]

Sets the module in training mode.

This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

Parameters:

mode (bool) – whether to set training mode (True) or evaluation mode (False). Default: True.

Returns:

self

Return type:

Module

forward(x: Tensor) → Tensor[source]

Forward method.

Parameters:

x – input tensor of size (*, in_features), or (*, in_features // world_size) if input_is_parallel is set to True.

Returns:

Output tensor of size (*, out_features).

ml.models.lora.lora(module: Embedding, r: int, alpha: float = 1.0, dropout: float = 0.0, merge: bool = False) → LoraEmbedding[source]
ml.models.lora.lora(module: Linear, r: int, alpha: float = 1.0, dropout: float = 0.0, merge: bool = False) → LoraLinear
ml.models.lora.lora(module: Conv1d, r: int, alpha: float = 1.0, dropout: float = 0.0, merge: bool = False) → LoraConv1d
ml.models.lora.lora(module: ConvTranspose1d, r: int, alpha: float = 1.0, dropout: float = 0.0, merge: bool = False) → LoraConvTranspose1d
ml.models.lora.lora(module: Conv2d, r: int, alpha: float = 1.0, dropout: float = 0.0, merge: bool = False) → LoraConv2d
ml.models.lora.lora(module: ConvTranspose2d, r: int, alpha: float = 1.0, dropout: float = 0.0, merge: bool = False) → LoraConvTranspose2d
ml.models.lora.lora(module: LSTM, r: int, alpha: float = 1.0, dropout: float = 0.0, merge: bool = False) → LoraLSTM
ml.models.lora.lora(module: GRU, r: int, alpha: float = 1.0, dropout: float = 0.0, merge: bool = False) → LoraGRU
ml.models.lora.lora(module: LSTMCell, r: int, alpha: float = 1.0, dropout: float = 0.0, merge: bool = False) → LoraLSTMCell
ml.models.lora.lora(module: GRUCell, r: int, alpha: float = 1.0, dropout: float = 0.0, merge: bool = False) → LoraGRUCell
ml.models.lora.lora(module: ParallelEmbedding, r: int, alpha: float = 1.0, dropout: float = 0.0, merge: bool = False) → LoraParallelEmbedding
ml.models.lora.lora(module: ColumnParallelLinear, r: int, alpha: float = 1.0, dropout: float = 0.0, merge: bool = False) → LoraColumnParallelLinear
ml.models.lora.lora(module: RowParallelLinear, r: int, alpha: float = 1.0, dropout: float = 0.0, merge: bool = False) → LoraRowParallelLinear
ml.models.lora.lora(module: Embedding | Linear | Conv1d | ConvTranspose1d | Conv2d | ConvTranspose2d | LSTM | GRU | LSTMCell | GRUCell | ColumnParallelLinear | RowParallelLinear | ParallelEmbedding, r: int, alpha: float = 1.0, dropout: float = 0.0, merge: bool = False) → Module

Wraps a module with LoRA.

This function takes a base module and returns the LoRA version of that module. The new module is effectively a drop-in replacement for the original module; for example, it can load the same state dict, and it has the same input and output shapes.

Parameters:
  • module – The module to wrap.

  • r – The number of LoRA components to use. If 0, then LoRA is not used.

  • alpha – The scaling factor for the LoRA components. A higher value means that more weight is given to the LoRA components.

  • dropout – The dropout probability applied to the input value before computing the LoRA components. This parameter is not supported for RNNs (because it would require modifying the underlying kernel).

  • merge – Whether to merge the LoRA components into the original weights. If True, then the LoRA components are merged into the weights during training, and the original weights are used during evaluation. If False, then the LoRA components are used during both training and evaluation.

Returns:

The LoRA version of the module.

Raises:

ValueError – If the module is not supported.
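
For example, wrapping a pre-trained convolution (the shapes, r=8, and dropout value are arbitrary):

import torch.nn as nn

from ml.models.lora import lora

conv = nn.Conv1d(16, 32, kernel_size=3, padding=1)
lora_conv = lora(conv, r=8, alpha=1.0, dropout=0.1)  # returns a LoraConv1d
lora_conv.load_state_dict(conv.state_dict())  # pre-trained weights carry over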

ml.models.lora.maybe_lora(module: T_module, r: int | None, alpha: float = 1.0, dropout: float = 0.0, merge: bool = False, freeze: bool = True) → T_module[source]

Apply LoRA to a supported module, if a LoRA rank is provided.

Parameters:
  • module – A supported module.

  • r – The LoRA rank.

  • alpha – The LoRA alpha parameter.

  • dropout – The LoRA dropout rate.

  • merge – Whether to merge the LoRA components into the original weights.

  • freeze – Whether to freeze the module’s parameters if a LoRA rank is not provided. This argument has no effect if a LoRA rank is provided, since downstream users can always freeze the module themselves. Typically, when trying out LoRA fine-tuning, downstream users will want to freeze most of the module’s parameters and apply LoRA only to a subset of its layers, so freezing is the default behavior.

Returns:

The module with LoRA applied, if a LoRA rank is provided.
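
A typical pattern is to thread an optional LoRA rank through a model's constructor so the same code path works with and without LoRA (the sizes and rank here are arbitrary):

import torch.nn as nn

from ml.models.lora import maybe_lora

lora_rank: int | None = 4

# With lora_rank=4 this returns a LoraLinear; with lora_rank=None it returns the
# original nn.Linear with its parameters frozen (freeze=True by default).
proj = maybe_lora(nn.Linear(256, 256), lora_rank)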

ml.models.lora.maybe_lora_weight_norm(module: T_module, r: int | None, alpha: float = 1.0, dropout: float = 0.0, merge: bool = False, freeze: bool = True) → T_module[source]
ml.models.lora.reset_lora_weights_(module: Module) → None[source]

Resets any LoRA weights in the module.

All of the LoRA modules have a reset_lora_parameters method that will reset the LoRA weights in-place. This function looks for any modules with this method and calls it.

Parameters:

module – The module to reset, in-place.
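
For example, to re-initialize only the LoRA factors before starting a new fine-tuning run (the sizes and rank are arbitrary):

import torch.nn as nn

from ml.models.lora import lora, reset_lora_weights_

model = nn.Sequential(lora(nn.Linear(8, 8), r=2), nn.ReLU(), nn.Linear(8, 4))
reset_lora_weights_(model)  # calls reset_lora_parameters() on the wrapped first layer only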

ml.models.lora.freeze_non_lora_(module: Module) → None[source]

Freezes any non-LoRA parameters in the module.

Parameters:

module – The module to freeze, in-place.
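
Together with the wrappers above, this gives the usual LoRA fine-tuning setup in which only the low-rank factors receive gradients (the sizes and rank are arbitrary):

import torch.nn as nn

from ml.models.lora import freeze_non_lora_, lora

model = nn.Sequential(lora(nn.Linear(128, 128), r=4), nn.GELU(), lora(nn.Linear(128, 10), r=4))
freeze_non_lora_(model)

# Only the LoRA parameters remain trainable.
trainable = [name for name, param in model.named_parameters() if param.requires_grad]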