Merged
Conversation
006a3fb to
ccfd9de
Compare
ccfd9de to
291d568
Compare
903955b to
bb2711f
Compare
…mpletion in abort path" This reverts commit 0367076b0817d5c75dfb83001ce7ce5c64d803a9. The commit being reverted added code to __qla2x00_abort_all_cmds() to call sp->done() without holding a spinlock. But unlike the older code below it, this new code failed to check sp->cmd_type and just assumed TYPE_SRB, which results in a jump to an invalid pointer in target-mode with TYPE_TGT_CMD: qla2xxx [0000:65:00.0]-d034:8: qla24xx_do_nack_work create sess success 0000000009f7a79b qla2xxx [0000:65:00.0]-5003:8: ISP System Error - mbx1=1ff5h mbx2=10h mbx3=0h mbx4=0h mbx5=191h mbx6=0h mbx7=0h. qla2xxx [0000:65:00.0]-d01e:8: -> fwdump no buffer qla2xxx [0000:65:00.0]-f03a:8: qla_target(0): System error async event 0x8002 occurred qla2xxx [0000:65:00.0]-00af:8: Performing ISP error recovery - ha=0000000058183fda. BUG: kernel NULL pointer dereference, address: 0000000000000000 PF: supervisor instruction fetch in kernel mode PF: error_code(0x0010) - not-present page PGD 0 P4D 0 Oops: 0010 [#1] SMP CPU: 2 PID: 9446 Comm: qla2xxx_8_dpc Tainted: G O 6.1.133 #1 Hardware name: Supermicro Super Server/X11SPL-F, BIOS 4.2 12/15/2023 RIP: 0010:0x0 Code: Unable to access opcode bytes at 0xffffffffffffffd6. RSP: 0018:ffffc90001f93dc8 EFLAGS: 00010206 RAX: 0000000000000282 RBX: 0000000000000355 RCX: ffff88810d16a000 RDX: ffff88810dbadaa8 RSI: 0000000000080000 RDI: ffff888169dc38c0 RBP: ffff888169dc38c0 R08: 0000000000000001 R09: 0000000000000045 R10: ffffffffa034bdf0 R11: 0000000000000000 R12: ffff88810800bb40 R13: 0000000000001aa8 R14: ffff888100136610 R15: ffff8881070f7400 FS: 0000000000000000(0000) GS:ffff88bf80080000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 000000010c8ff006 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? __die+0x4d/0x8b ? page_fault_oops+0x91/0x180 ? trace_buffer_unlock_commit_regs+0x38/0x1a0 ? exc_page_fault+0x391/0x5e0 ? asm_exc_page_fault+0x22/0x30 __qla2x00_abort_all_cmds+0xcb/0x3e0 [qla2xxx_scst] qla2x00_abort_all_cmds+0x50/0x70 [qla2xxx_scst] qla2x00_abort_isp_cleanup+0x3b7/0x4b0 [qla2xxx_scst] qla2x00_abort_isp+0xfd/0x860 [qla2xxx_scst] qla2x00_do_dpc+0x581/0xa40 [qla2xxx_scst] kthread+0xa8/0xd0 </TASK> Then commit 4475afa2646d ("scsi: qla2xxx: Complete command early within lock") added the spinlock back, because not having the lock caused a race and a crash. But qla2x00_abort_srb() in the switch below already checks for qla2x00_chip_is_down() and handles it the same way, so the code above the switch is now redundant and still buggy in target-mode. Remove it. Cc: stable@vger.kernel.org Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://patch.msgid.link/3a8022dc-bcfd-4b01-9f9b-7a9ec61fa2a3@cybernetics.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit b57fbc88715b upstream ]
When given the module parameter qlini_mode=exclusive, qla2xxx in initiator mode is initially unable to successfully send SCSI commands to devices it finds while scanning, resulting in an escalating series of resets until an adapter reset clears the issue. Fix by checking the active mode instead of the module parameter. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://patch.msgid.link/1715ec14-ba9a-45dc-9cf2-d41aa6b81b5e@cybernetics.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit 8f58fc64d559 upstream ]
When qla2xxx is loaded with qlini_mode=disabled,
ha->flags.disable_msix_handshake is used before it is set, resulting in
the wrong interrupt handler being used on certain HBAs
(qla2xxx_msix_rsp_q_hs() is used when qla2xxx_msix_rsp_q() should be
used). The only difference between these two interrupt handlers is that
the _hs() version writes to a register to clear the "RISC" interrupt,
whereas the other version does not. So this bug results in the RISC
interrupt being cleared when it should not be. This occasionally causes
a different interrupt handler qla24xx_msix_default() for a different
vector to see ((stat & HSRX_RISC_INT) == 0) and ignore its interrupt,
which then causes problems like:
qla2xxx [0000:02:00.0]-d04c:6: MBX Command timeout for cmd 20,
iocontrol=8 jiffies=1090c0300 mb[0-3]=[0x4000 0x0 0x40 0xda] mb7 0x500
host_status 0x40000010 hccr 0x3f00
qla2xxx [0000:02:00.0]-101e:6: Mailbox cmd timeout occurred, cmd=0x20,
mb[0]=0x20. Scheduling ISP abort
(the cmd varies; sometimes it is 0x20, 0x22, 0x54, 0x5a, 0x5d, or 0x6a)
This problem can be reproduced with a 16 or 32 Gbps HBA by loading
qla2xxx with qlini_mode=disabled and running a high IOPS test while
triggering frequent RSCN database change events.
While analyzing the problem I discovered that even with
disable_msix_handshake forced to 0, it is not necessary to clear the
RISC interrupt from qla2xxx_msix_rsp_q_hs() (more below). So just
completely remove qla2xxx_msix_rsp_q_hs() and the logic for selecting
it, which also fixes the bug with qlini_mode=disabled.
The test below describes the justification for not needing
qla2xxx_msix_rsp_q_hs():
Force disable_msix_handshake to 0:
qla24xx_config_rings():
if (0 && (ha->fw_attributes & BIT_6) && (IS_MSIX_NACK_CAPABLE(ha)) &&
(ha->flags.msix_enabled)) {
In qla24xx_msix_rsp_q() and qla2xxx_msix_rsp_q_hs(), check:
(rd_reg_dword(®->host_status) & HSRX_RISC_INT)
Count the number of calls to each function with HSRX_RISC_INT set and
the number with HSRX_RISC_INT not set while performing some I/O.
If qla2xxx_msix_rsp_q_hs() clears the RISC interrupt (original code):
qla24xx_msix_rsp_q: 50% of calls have HSRX_RISC_INT set
qla2xxx_msix_rsp_q_hs: 5% of calls have HSRX_RISC_INT set
(# of qla2xxx_msix_rsp_q_hs interrupts) =
(# of qla24xx_msix_rsp_q interrupts) * 3
If qla2xxx_msix_rsp_q_hs() does not clear the RISC interrupt (patched
code):
qla24xx_msix_rsp_q: 100% of calls have HSRX_RISC_INT set
qla2xxx_msix_rsp_q_hs: 9% of calls have HSRX_RISC_INT set
(# of qla2xxx_msix_rsp_q_hs interrupts) =
(# of qla24xx_msix_rsp_q interrupts) * 3
In the case of the original code, qla24xx_msix_rsp_q() was seeing
HSRX_RISC_INT set only 50% of the time because qla2xxx_msix_rsp_q_hs()
was clearing it when it shouldn't have been. In the patched code,
qla24xx_msix_rsp_q() sees HSRX_RISC_INT set 100% of the time, which
makes sense if that interrupt handler needs to clear the RISC interrupt
(which it does). qla2xxx_msix_rsp_q_hs() sees HSRX_RISC_INT only 9% of
the time, which is just overlap from the other interrupt during the
high IOPS test.
Tested with SCST on:
QLE2742 FW:v9.08.02 (32 Gbps 2-port)
QLE2694L FW:v9.10.11 (16 Gbps 4-port)
QLE2694L FW:v9.08.02 (16 Gbps 4-port)
QLE2672 FW:v8.07.12 (16 Gbps 2-port)
both initiator and target mode
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/56d378eb-14ad-49c7-bae9-c649b6c7691e@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit 4f6aaade2a22 upstream ]
If a mailbox command completes immediately after wait_for_completion_timeout() times out, ha->mbx_intr_comp could be left in an inconsistent state, causing the next mailbox command not to wait for the hardware. Fix by reinitializing the completion before use. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://patch.msgid.link/11b6485e-0bfd-4784-8f99-c06a196dad94@cybernetics.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit 957aa5974989 upstream ]
As far as I can tell, CONTINUE_TGT_IO_TYPE and CTIO_A64_TYPE are message types from non-FWI2 boards (older than ISP24xx), which are not supported by qla_target.c. Removing them makes it possible to turn a void * into the real type and avoid some typecasts. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://patch.msgid.link/cb006628-e321-4e30-a60b-08b37b8685a5@cybernetics.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit 9da4e1dcea46 upstream ]
Print better debug info when terminating a command, and print the response status from the hardware. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://patch.msgid.link/22f8a0b6-0e24-474d-9f28-9d65c9b7af03@cybernetics.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit c34e373f535e upstream ]
Properly set the nport_handle field of the terminate exchange message. Previously when this field was not set properly, the term exchange would fail when cmd_sent_to_fw == 1 but work when cmd_sent_to_fw == 0 (i.e. it would fail when the HW was actively transferring data or status for the cmd but work when the HW was idle). With this change, term exchange works in any cmd state, which now makes it possible to abort a command that is locked up in the HW. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://patch.msgid.link/1a221699-969b-4f28-8ea4-395d2f7a7c0a@cybernetics.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit ed382b95f5de upstream ]
Commit aefed3e5548f ("scsi: qla2xxx: target: Fix offline port handling
and host reset handling") caused two problems:
1. Commands sent to FW, after chip reset got stuck and never freed as FW
is not going to respond to them anymore.
2. BUG_ON(cmd->sg_mapped) in qlt_free_cmd(). Commit 26f9ce53817a
("scsi: qla2xxx: Fix missed DMA unmap for aborted commands")
attempted to fix this, but introduced another bug under different
circumstances when two different CPUs were racing to call
qlt_unmap_sg() at the same time: BUG_ON(!valid_dma_direction(dir)) in
dma_unmap_sg_attrs().
So revert "scsi: qla2xxx: Fix missed DMA unmap for aborted commands" and
partially revert "scsi: qla2xxx: target: Fix offline port handling and
host reset handling" at __qla2x00_abort_all_cmds.
Fixes: aefed3e5548f ("scsi: qla2xxx: target: Fix offline port handling and host reset handling")
Fixes: 26f9ce53817a ("scsi: qla2xxx: Fix missed DMA unmap for aborted commands")
Co-developed-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/0e7e5d26-e7a0-42d1-8235-40eeb27f3e98@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit d46c69a087aa upstream ]
cmd->cmd_lock only protects cmd->aborted, but when deciding how to process a cmd, it is necessary to consider other factors such as cmd->state and if the chip has been reset, which are protected by qpair->qp_lock_ptr. So replace cmd_lock with qp_lock_ptr, whick makes it possible to check additional values and make decisions about what to do without racing with the CTIO handler and other code. - Lock cmd->qpair->qp_lock_ptr when aborting a cmd. - Eliminate cmd->cmd_lock and change cmd->aborted to a bitfield since it is now protected by qp_lock_ptr just like all the other flags. - Add another command state QLA_TGT_STATE_DONE to avoid any possible races between qlt_abort_cmd() and tgt_ops->free_cmd(). - Add the cmd->sent_term_exchg flag to indicate if qlt_send_term_exchange() has already been called. - Export qlt_send_term_exchange() for SCST so that it can be called directly instead of trying to make qlt_abort_cmd() work for both TMR abort and HW timeout. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://patch.msgid.link/2c8d03e4-308b-4d5a-a418-a334be23f815@cybernetics.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit 17488f139074 upstream ]
sqa_on_hw_pending_cmd_timeout() currently unmaps DMA, sets outstanding_cmds[h] to NULL, and forces the command to complete. This could cause a kernel crash if the HW later accesses the DMA mapping. It can also cause other problems if outstanding_cmds[h] is reused for a different command. Fix by doing this instead: - In sqa_on_hw_pending_cmd_timeout(), call qlt_send_term_exchange() first and then restart the timeout. After another timeout, reset the ISP. Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
…y_to_xfer Similar fixes to both functions: qlt_xmit_response: - If the cmd cannot be processed, remember to call ->free_cmd() to prevent the target-mode midlevel from seeing a cmd lockup. - Do not try to send the response if the exchange has been terminated. - Check for chip reset once after lock instead of both before and after lock. - Give errors from qlt_pre_xmit_response() a lower priority to compensate for removing the first check for chip reset. qlt_rdy_to_xfer: - Check for chip reset after lock instead of before lock to avoid races. - Do not try to receive data if the exchange has been terminated. - Give errors from qlt_pci_map_calc_cnt() a lower priority to compensate for moving the check for chip reset. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://patch.msgid.link/cd6ccd31-33fa-4454-be36-507bf578a546@cybernetics.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit 5c50d84798eb upstream ]
(target mode) If handle_tmr() fails: - The code for QLA_TGT_ABTS results in memory-use-after-free and double-free: qlt_do_tmr_work() qlt_build_abts_resp_iocb() qpair->req->outstanding_cmds[h] = (srb_t *)mcmd; mempool_free(mcmd, qla_tgt_mgmt_cmd_mempool); FIRST FREE qlt_handle_abts_completion() mcmd = qlt_ctio_to_cmd() cmd = req->outstanding_cmds[h]; return cmd; vha = mcmd->vha; USE-AFTER-FREE ha->tgt.tgt_ops->free_mcmd(mcmd); SECOND FREE - qlt_send_busy() makes no sense because it sends a SCSI command response instead of a TMR response. Instead just call qlt_xmit_tm_rsp() to send a TMR failed response, since that code is well-tested and handles a number of corner cases. But it would be incorrect to call ha->tgt.tgt_ops->free_mcmd() after handle_tmr() failed, so add a flag to mcmd indicating the proper way to free the mcmd so that qlt_xmit_tm_rsp() can be used for both cases. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://patch.msgid.link/09a1ff3d-6738-4953-a31b-10e89c540462@cybernetics.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit 3d56983cc6f0 upstream ]
struct atio7_fcp_cmnd is a variable-length data structure because of add_cdb_len, but it is embedded in struct atio_from_isp and copied around like a fixed-length data structure. For big CDBs > 16 bytes, get_datalen_for_atio() called on a fixed-length copy of the atio will access invalid memory. In some cases this can be fixed by moving the atio to the end of the data structure and using a variable-length allocation. In other cases such as allocating struct qla_tgt_cmd, the fixed-length data structures are preallocated for speed, so in the case that add_cdb_len != 0, allocate a separate buffer for the CDB. Also add memcpy_atio() as a safeguard against invalid memory accesses. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://patch.msgid.link/306a9d0b-3c89-42fc-a69c-eebca8171347@cybernetics.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit 091719c21d5a upstream ]
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Add cmd->rsp_sent to indicate that the SCSI status has been sent successfully, so that SCST can be informed of any transport errors. This will also be used for logging in later patches. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://patch.msgid.link/d4b0203f-7817-4517-9789-5866bb24fad7@cybernetics.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit f4199d581256 upstream ]
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
- Add the command tag to various messages so that different messages about the same command can be correlated. - For CTIO errors (i.e. when the HW reports an error about a cmd), print the cmd tag, opcode, state, initiator WWPN, and LUN. This info helps an administrator determine what is going wrong. - When a command experiences a transport error, log a message when it is freed. This makes debugging exceptions easier. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://patch.msgid.link/c579987d-5658-41ae-9653-f0e58c9d1880@cybernetics.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit 04957d8c9852 upstream ]
Background: loading qla2xxx with "ql2xtgt_tape_enable=1" enables
Sequence Level Error Recovery (SLER), which is most commonly used for
tape drives. With SLER enabled, if there is a recoverable I/O error
during a SCSI command, a Sequence Retransmission Request (SRR) will be
used to retry the I/O at a low-level completely within the driver
without propagating the error to the upper levels of the SCSI stack.
SRR support was removed in 2017 by commit 2c39b5ca2a8c ("qla2xxx: Remove
SRR code"). Add it back, new and improved.
The old removed SRR code used sequence numbers to correlate the SRR
CTIOs with SRR immediate notify messages. I don't see how that would
work reliably with MSI-X interrupts and multiple queues. So instead use
the exchange address to find the command associated with the immediate
notify (qlt_srr_to_cmd).
The old removed SRR code had a function qlt_check_srr_debug() to
simulate a SRR, but it didn't work for me. Instead I just used fiber
optic attenuators attached to the FC cable to reduce the strength of the
signal and induce errors. Unfortunately this only worked for inducing
SRRs on Data-Out (write) commands, so that is all I was able to test.
The code to build a new scatterlist for a SRR with nonzero offset has
been improved to reduce memory requirements and has been well-tested.
However it does not support protection information.
When a single cmd gets multiple SRRs, the old removed SRR code would
restore the data buffer from the values in cmd->se_cmd before processing
the new SRR. That might be needed if the offset for the new SRR was
lower than the offset for the previous SRR, but I am not sure if that
can happen. In my testing, when a single cmd gets multiple SRRs, the
SRR offset always increases or stays the same. But in case it can
decrease, I added the function qlt_restore_orig_sg(). If this is not
supposed to happen then qlt_restore_orig_sg() can be removed to simplify
the code.
I ran into some HBA firmware bugs with QLE269x, QLE27xx, and QLE28xx
firmware 9.05.xx - 9.08.xx where a SRR would cause the HBA to misbehave
badly. Since SRRs are rare and therefore difficult to test, I figured
it would be worth checking for the buggy firmware and disabling SLER
with a warning instead of letting others run into the same problem on
the rare occasion that they get a SRR. This turned out to be difficult
because the firmware version isn't known in the normal NVRAM config
routine, so I added a second NVRAM config routine that is called after
the firmware version is known.
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Link: https://patch.msgid.link/654b7181-b79e-40ed-a15b-6d6e441a5d5f@cybernetics.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ commit c7bd85a7b9c5 upstream ]
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
The driver associates two different structs with numeric handles and passes the handles to the hardware. When the hardware passes the handle back to the driver, the driver consults a table of void * to convert the handle back to the struct without checking the type of struct. This can lead to type confusion if the HBA firmware misbehaves (and some firmware versions do). So verify the type of struct is what is expected before using it. But we can also do better than that. Also verify that the exchange address of the message sent from the hardware matches the exchange address of the command being returned. This adds an extra guard against buggy HBA firmware that returns duplicate messages multiple times (which has also been seen) in case the driver has reused the handle for a different command of the same type. These problems were seen on a QLE2694L with firmware 9.08.02 when testing SLER / SRR support. The SRR caused the HBA to flood the response queue with hundreds of bogus entries. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://patch.msgid.link/7c7cb574-fe62-42ae-b800-d136d8dd89ca@cybernetics.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [ commit 4f5eb50f7c82 upstream ]
This enables the initiator to abort commands that are stuck pending in the HW without waiting for a timeout. Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
bb2711f to
4da5957
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This patch series improves the qla2xxx FC driver in target mode.
These patches have been well-tested at my employer with qla2xxx+SCST in both initiator mode and target mode and with a variety of FC HBAs and initiators.
Summary of patches:
Some of these patches improve handling of aborts and task management requests.
This is some of the testing that I did:
Test 1: Use /dev/sg to queue random disk I/O with short timeouts; make sure cmds are aborted successfully.
Test 2: Queue lots of disk I/O, then use "sg_reset -N -d /dev/sg" on initiator to reset logical unit.
Test 3: Queue lots of disk I/O, then use "sg_reset -N -t /dev/sg" on initiator to reset target.
Test 4: Queue lots of disk I/O, then use "sg_reset -N -b /dev/sg" on initiator to reset bus.
Test 5: Queue lots of disk I/O, then use "sg_reset -N -H /dev/sg" on initiator to reset host.
Test 6: Use fiber channel attenuator to trigger SRR during write/read/compare test; check data integrity.
With my patches, SCST passes all of these tests.
Tony Battersby
https://www.cybernetics.com/