summaryrefslogtreecommitdiff
path: root/intel/intel_bufmgr_gem.c
AgeCommit message (Collapse)Author
2010-11-07intel: initialize bufmgr.bo_mrb_exec unconditionallyAlbert Damen
If bufmgr.bo_mrb_exec is not set, drm_intel_bo_mrb_exec returns ENODEV even though drm_intel_gem_bo_mrb_exec2 will work fine for the RENDER ring. Fixes xf86-video-intel after commit 'add BLT ring support' (5bed685f76) with kernels without BSD or BLT ring support (2.6.34 and before). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31443 Signed-off-by: Albert Damen <albrt@gmx.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-02intel: Drop silly asserts on mappings present at unmap time.Eric Anholt
The intent of these was to catch mismatched map/unmap. What it actually did was check whether there was ever a mapping of that type (including in a previous life of the buffer through the userland BO cache), not whether they were mismatched. We don't even actually want to catch mismatched map/unmap, unless we also do refcounting, since at one point Mesa would do map/map/use/unmap/unmap. Just remove this code instead.
2010-11-02intel: Remove gratuitous assert on bo_reference.Eric Anholt
This couldn't be triggered except by overflow, since there's an assert in unreference to catch the usual failure of over-unreferencing.
2010-11-01intel: Remove stale comment.Eric Anholt
2010-10-29intel: enable relaxed fence allocation for i915Chris Wilson
The kernel has always allowed userspace to underallocate objects supplied for fencing. However, the kernel only allocated the object size for the fence in the GTT and so caused tiling corruption. More recently the kernel does allocate the full fence region in the GTT for an under-sized object and so advertises that clients may finally make use of this feature. The biggest benefit is for texture-heavy GL games on i945 such as World of Padman which go from needing over 1GiB of RAM to play to fitting in the GTT! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-26intel: Prepare for BLT ring split.Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-01intel: Downgrade error warnings to debugChris Wilson
As the higher layers check the error return from libdrm-intel and are supposed to handle the error (and print their own warning in extremis) the voluminous output on stderr is just noise and a hazard in its own right. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-09-25intel: Replace open-coded drmIoctl with calls to drmIoctl()Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-29intel: Suppress the error return from setting domains after mapping.Chris Wilson
If the mapping succeeds we have a valid pointer. If setting the domain failures we may incur cache corruption. However the usual failure mode is because of a hung GPU, in which case it is preferable to ignore the minor error from setting the domain and continue on oblivious. If these errors persist, we should rate limit the warning [or even just remove it]. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-24intel: Limit tiled pitches to 8192 on pre-i965.Chris Wilson
Fixes: Bug 28515 - Failed to allocate framebuffer when exceed 2048 width https://bugs.freedesktop.org/show_bug.cgi?id=28515 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-22intel: Only adjust the local stride used for SET_TILING in tiled allocChris Wilson
Mesa uses the returned pitch from alloc_tiled, so make sure that we set it correctly before modifying the stride used for the SET_TILING call. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-22intel: Restore SET_TILING for non-flinked bo.Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-22intel: '===' != '=='Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-22intel: Sanitise strides for linear buffers and SET_TILINGChris Wilson
Ensure that the user doesn't attempt to specify a stride to use with a linear buffer by forcing such to be zero. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-21intel: Print out debugging message following ENOSPCChris Wilson
execbuffer() returns ENOSPC if it cannot fit the batch buffer into the aperture which is the error we want to diagnose here. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-21intel: Scan the cache for old bo once every second.Chris Wilson
Rearrange the cache cleanup so that we always scan following a final unreference, and guard against multiple scans in a single second. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-21intel: Force stride to be 0 for I915_TILING_NONE.Chris Wilson
When allocating a tiled buffer, if we remove the desired tiling mode due to it being beyond hardware limits, also remove the stride. This ensures that we only ever use stride 0 with I915_TILING_NONE. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-21intel: Defer tiling change to allocation.Chris Wilson
As we now expose a method to allocate tiled buffers, it makes more sense to defer the SET_TILING until required. Besides the slim chance that it will be a no-op, by delaying the change we are less likely to stall on waiting for a bound buffer to release a fence register. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-21intel: Track tiling strideChris Wilson
We need to inform the kernel if the tiling stride changes and not only for changes of the tiling mode. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-10intel: Fix several other paths for buffers pointing at themselves.Eric Anholt
2010-06-10intel: Add more intermediate sizes of cache buckets between powers of 2.Eric Anholt
We had two cases recently where the rounding to powers of two hurt badly: 4:2:0 YUV HD video frames would round up from 2.2MB to 4MB, and Urban Terror was hitting aperture size limitations. For UT, this is because mipmap trees for power of two texture sizes will land right in the middle between two cache buckets. By giving a few more sizes between powers of two, Urban Terror on my 945 ends up consuming 207MB of GEM objects instead of 272MB, and HD video decode on Ironlake goes from 99MB to 75MB. cairo-perf-diff of the benchmarks for gl and xlib shows a 1.09x and 1.06x speedup and a 1.07x, 1.08x, and 1.11x slowdown. From this, I think this patch was really a no-op in terms of performance for these CPU-bound workloads.
2010-06-09intel: Convert to untiled pitches if surface is too large for tiling.Chris Wilson
If the pitch is too large for the hardware to tile, recompute the required surface size based on the untiled pitch and alignments. For the older hardware, which has smaller limits and greater restrictions, this may be a considerable saving in allocation size. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-07Allow a buffer to point at itself and still get relocs.Eric Anholt
I'm using this in experiments with the i965 Mesa driver.
2010-06-06intel: Add support for kernel multi-ringbuffer API.Zou Nan hai
This introduces a new API to exec on BSD ring buffer, for H.264 VLD decoding. Signed-off-by: Xiang Hai hao <haihao.xiang@intel.com> Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
2010-05-24intel: Don't change tiling mode unless the kernel reports success.Chris Wilson
Fixes: Bug 26686 - Some textures are distorted with libdrm 2.4.18 in GTAVC&GTA3 http://bugs.freedesktop.org/show_bug.cgi?id=26686 This bug continues to haunt me. The kernel SET_TILING ioctl is inconsistent in its return values when reporting an error. If one of its sanity checks fail, then the input values are left unchanged. If the kernel later fails to change the tiling mode, then the input values are modified to match the current tiling on the object. In short, userspace cannot trust the return values upon error and so we must assume that upon error our current tiling mode matches reality and not update.
2010-05-13Revert "intel: We don't need to take the bufmgr lock whilst mapping."Chris Wilson
This reverts commit 7ca558494dd3f68f29bb6ca981de9b8f49620b60. This was pushed ahead of an essential review of bo level locking in mesa, without which we cannot know whether removing this lock is safe.
2010-05-11intel: query whether a buffer is reusable.Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-06intel: We don't need to take the bufmgr lock whilst mapping.Chris Wilson
2010-04-11intel: Use the correct size when allocating reloc_target_info arrayChris Wilson
Thomas tracked down this error with kdm and commit b509640: ==4320== Invalid write of size 8 ==4320== at 0x9A97998: do_bo_emit_reloc (in /usr/lib/libdrm_intel.so.1.0.0) ==4320== by 0x9A97B9C: drm_intel_gem_bo_emit_reloc (in /usr/lib/libdrm_intel.so.1.0.0) ==4320== by 0xAED3234: intel_batchbuffer_emit_reloc (in /usr/lib/xorg/modules/dri/i965_dri.so) ==4320== by 0xAF13827: brw_emit_vertices (in /usr/lib/xorg/modules/dri/i965_dri.so) ==4320== by 0xAF1F14D: brw_upload_state (in /usr/lib/xorg/modules/dri/i965_dri.so) ==4320== by 0xAF12122: brw_draw_prims (in /usr/lib/xorg/modules/dri/i965_dri.so) ==4320== by 0xB256824: vbo_exec_vtx_flush (in /usr/lib/xorg/modules/dri/libdricore.so) ==4320== by 0xB2523BB: vbo_exec_FlushVertices_internal (in /usr/lib/xorg/modules/dri/libdricore.so) ==4320== by 0xB252411: vbo_exec_FlushVertices (in /usr/lib/xorg/modules/dri/libdricore.so) ==4320== by 0xB195A3D: _mesa_PopAttrib (in /usr/lib/xorg/modules/dri/libdricore.so) ==4320== by 0x8DF0F02: __glXDisp_Render (in /usr/lib/xorg/modules/extensions/libglx.xorg) ==4320== by 0x8DF517F: __glXDispatch (in /usr/lib/xorg/modules/extensions/libglx.xorg) ==4320== Address 0x126a8b80 is 0 bytes after a block of size 16,368 alloc'd ==4320== at 0x4C23E03: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==4320== by 0x9A97A64: do_bo_emit_reloc (in /usr/lib/libdrm_intel.so.1.0.0) ==4320== by 0x9A97B9C: drm_intel_gem_bo_emit_reloc (in /usr/lib/libdrm_intel.so.1.0.0) ==4320== by 0xAED3234: intel_batchbuffer_emit_reloc (in /usr/lib/xorg/modules/dri/i965_dri.so) ==4320== by 0xAF191DB: upload_binding_table_pointers (in /usr/lib/xorg/modules/dri/i965_dri.so) ==4320== by 0xAF1F14D: brw_upload_state (in /usr/lib/xorg/modules/dri/i965_dri.so) ==4320== by 0xAF12122: brw_draw_prims (in /usr/lib/xorg/modules/dri/i965_dri.so) ==4320== by 0xB255EF6: vbo_exec_DrawArrays (in /usr/lib/xorg/modules/dri/libdricore.so) ==4320== by 0x8DF67A3: __glXDisp_DrawArrays (in /usr/lib/xorg/modules/extensions/libglx.xorg) ==4320== by 0x8DF0F02: __glXDisp_Render (in /usr/lib/xorg/modules/extensions/libglx.xorg) ==4320== by 0x8DF517F: __glXDispatch (in /usr/lib/xorg/modules/extensions/libglx.xorg) ==4320== by 0x446293: ??? (in /usr/bin/Xorg) which is simply due to only allocating space for the pointers and not the structs themselves. D'oh. Reported-by: Thomas Bächler <thomas@archlinux.org> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-03-17intel: Align untiled buffer pitch to 64B.Eric Anholt
This is the largest untiled pitch requirement from gen2 through gen4. It's only the case for gen3 rendering to color regions with depth, but it's rare for this to be a significant factor in memory usage -- for example, gen4 requires 1 or 2 times the element size, or up to 64 bytes depending on the size of the elements. This is easier than encoding all the various little quirks for untiled pitch alignment, since we rarely do untiled now.
2010-03-17libdrm: Move intel_atomic.h to libdrm core for sharing.Pauli Nieminen
intel_atomic.h includes very usefull atomic operations for lock free parrallel access of variables. Moving these to core libdrm for code sharing with radeon. Signed-off-by: Pauli Nieminen <suokkos@gmail.com>
2010-03-07intel: Repeat execbuffer if interrupted by signalChris Wilson
Repeat while EINTR, not EAGAIN! One more source of corruption erradicated, hurray! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-03-04intel: Only align Y-tiling pitch to the Y tile width.Eric Anholt
Fixes piglit depth-tex-modes on gen4.
2010-03-04intel: Propagate some more error returnsChris Wilson
Ensure that errors from the kernel are propagated back to the caller, and not masked with return 0; Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-03-03intel: Update the needs_fence flag of buffers on the validate list.Eric Anholt
Fixes fbo-copyteximage on i915 with texture tiling and execbuf2 fenced relocs.
2010-03-02intel: Don't enable execbuf2 fenced relocs unless we have execbuf2.Eric Anholt
2010-03-02intel: Don't tile-align pitch for untiled buffers.Eric Anholt
This allows Mesa to use drm_intel_bo_alloc_tiled() for its tiled buffers, since it makes its decision about pitch before telling libdrm. They happen to be the same choices for the tiled case.
2010-03-02intel: Fix typo in conversion from IS_GEN to bufmgr_gem->gen.Eric Anholt
Luckily I caught the bug with the first consumer of the interface.
2010-03-02intel: add a comment about tiled buffer alloc height alignment from Mesa.Eric Anholt
2010-03-02intel: Use an integer for chipset generation instead of many conditionals.Eric Anholt
Saves a bunch of comparisons in hot paths.
2010-03-02libdrm/intel: execbuf2 supportJesse Barnes
This patch to libdrm adds support for the new execbuf2 ioctl. If detected, it will be used instead of the old ioctl. By using the new drm_intel_bufmgr_gem_enable_fenced_relocs(), you can indicate that any time a fence register is actually required for a relocation target you will call drm_intel_bo_emit_reloc_fence instead of drm_intel_bo_emit_reloc, which will reduce fence register pressure. Signed-off-by: Eric Anholt <eric@anholt.net>
2010-02-25intel: Add initial support for Sandybridge, and clean up the #defines.Eric Anholt
2010-02-10intel: Handle resetting of input params after EINTR during SET_TILINGChris Wilson
The SET_TILING is pernicious in that it overwrites the input arguments following an error in order to report the current tiling state of the buffer. This caught us by surprise as we then fed those arguments back into to the ioctl unmodified following an EINTR and so the kernel then reported success for the no-op. We interpreted this success as meaning that the tiling on the buffer had changed so updated our state and started using the buffer incorrectly in the new tiled/untiled manner. This lead to all sorts of random corruption and GPU hangs, even though the batch buffers would look sane (when the GPU had not wandered off into forbidden territory). References: Bug 25475 - [i915] Xorg crash / Execbuf while wedged http://bugs.freedesktop.org/show_bug.cgi?id=25475 Bug 25554 - i830_uxa_prepare_access: gtt bo map failed: Input/output error http://bugs.freedesktop.org/show_bug.cgi?id=25554 (And probably every other weird bug in the last few months.) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-02-09intel: Account for potential pinned buffers hogging fencesChris Wilson
As the kernel reports the total number of fences, we must guess how many fences are likely to be pinned. In the typical system these will be only used by the scanout buffers, of which there may be one per pipe, and any number of manually pinned fenced buffers. So take a conservative guess and reserve two fences for use by the system. Note this reduces the number of fences to 3 for i915 and prior. Reference: http://bugs.freedesktop.org/show_bug.cgi?id=25911 The latest intel driver 2.10.0 causes kernel oops and system hangs Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-02-02intel: check return value for callocDave Airlie
2009-12-08intel: Clear virtual after failing to mmap_gtt.Chris Wilson
Don't store the error return in bo_gem->gtt_virtual or else we will attempt to use that as a valid pointer in future mappings. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-12-05intel: Expect caller to guarantee thread-safety of bo during relocChris Wilson
This removes the foremost prolific user of mutexes in libdrm_intel.so. The other uses of the bufmgr_gem->mutex to serial access to individual bos are currently required by Mesa, and are far less frequent. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [anholt: This chunk looks good...] Acked-by: Eric Anholt <eric@anholt.net>
2009-12-02intel: Free memory before inserting bo into cache.Chris Wilson
This has the unfortunate behaviour of releasing our malloc cache, but the alternative is for X to consume a couple of gigabytes of ram and die during testing. Fortunately the extra mallocs have little impact on performance whereas avoiding swap and death, lots. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-12-02intel: Check and propagate errors from building reloc-treeChris Wilson
Instead of forcing the caller to check after every emit_reloc(), we can flag the object as being in error, propagating that error upwards through the relocation tree, and failing the eventual batch buffer execution. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-12-02intel: Repeat execbuffer after EINTRChris Wilson
EAGAIN cannot be raised by the current code, but the system call maybe interrupted and so return EINTR. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>